Foreword


Beginning in the spring of 2000, a series of four one-semester courses 
were taught at Princeton University whose purpose was to present, in 
an integrated manner, the core areas of analysis. The objective was to 
make plain the organic unity that exists between the various parts of the 
subject, and to illustrate the wide applicability of ideas of analysis to 
other fields of mathematics and science. The present series of books is 
an elaboration of the lectures that were given. 

While there are a number of excellent texts dealing with individual 
parts of what we cover, our exposition aims at a different goal: pre- 
senting the various sub-areas of analysis not as separate disciplines, but 
rather as highly interconnected. It is our view that seeing these relations 
and their resulting synergies will motivate the reader to attain a better 
understanding of the subject as a whole. With this outcome in mind, we 
have concentrated on the main ideas and theorems that have shaped the 
field (sometimes sacrificing a more systematic approach), and we have 
been sensitive to the historical order in which the logic of the subject 
developed. 

We have organized our exposition into four volumes, each reflecting 
the material covered in a semester. Their contents may be broadly sum- 
marized as follows: 

I. Fourier series and integrals. 

II. Complex analysis. 

III. Measure theory, Lebesgue integration, and Hilbert spaces. 

IV. A selection of further topics, including functional analysis, distri- 
butions, and elements of probability theory. 

However, this listing does not by itself give a complete picture of 
the many interconnections that are presented, nor of the applications 
to other branches that are highlighted. To give a few examples: the ele- 
ments of (finite) Fourier series studied in Book I, which lead to Dirichlet 
characters, and from there to the infinitude of primes in an arithmetic 
progression; the X-ray and Radon transforms, which arise in a number of
problems in Book I, and reappear in Book III to play an important role in
understanding Besicovitch-like sets in two and three dimensions; Fatou’s
theorem, which guarantees the existence of boundary values of bounded 
holomorphic functions in the disc, and whose proof relies on ideas devel- 
oped in each of the first three books; and the theta function, which first 
occurs in Book I in the solution of the heat equation, and is then used 
in Book II to find the number of ways an integer can be represented as 
the sum of two or four squares, and in the analytic continuation of the 
zeta function. 

A few further words about the books and the courses on which they 
were based. These courses were given at a rather intensive pace, with 48
lecture-hours a semester. The weekly problem sets played an indispens- 
able part, and as a result exercises and problems have a similarly im- 
portant role in our books. Each chapter has a series of “Exercises” that 
are tied directly to the text, and while some are easy, others may require 
more effort. However, the substantial number of hints that are given 
should enable the reader to attack most exercises. There are also more 
involved and challenging “Problems”; the ones that are most difficult, or 
go beyond the scope of the text, are marked with an asterisk. 

Despite the substantial connections that exist between the different 
volumes, enough overlapping material has been provided so that each of 
the first three books requires only minimal prerequisites: acquaintance 
with elementary topics in analysis such as limits, series, differentiable 
functions, and Riemann integration, together with some exposure to lin- 
ear algebra. This makes these books accessible to students interested 
in such diverse disciplines as mathematics, physics, engineering, and 
finance, at both the undergraduate and graduate level. 

It is with great pleasure that we express our appreciation to all who 
have aided in this enterprise. We are particularly grateful to the stu- 
dents who participated in the four courses. Their continuing interest, 
enthusiasm, and dedication provided the encouragement that made this 
project possible. We also wish to thank Adrian Banner and Jose Luis 
Rodrigo for their special help in running the courses, and their efforts to 
see that the students got the most from each class. In addition, Adrian 
Banner also made valuable suggestions that are incorporated in the text. 

We wish also to record a note of special thanks for the following
individuals: Charles Fefferman, who taught the first week (successfully
launching the whole project!); Paul Hagelstein, who in addition to read- 
ing part of the manuscript taught several weeks of one of the courses, and 
has since taken over the teaching of the second round of the series; and 
Daniel Levine, who gave valuable help in proof-reading. Last but not 
least, our thanks go to Gerree Pecht, for her consummate skill in type- 
setting and for the time and energy she spent in the preparation of all 
aspects of the lectures, such as transparencies, notes, and the manuscript. 

We are also happy to acknowledge our indebtedness for the support 
we received from the 250th Anniversary Fund of Princeton University, 
and the National Science Foundation’s VIGRE program. 


Elias M. Stein 
Rami Shakarchi 

Princeton, New Jersey 
August 2002 


In this third volume we establish the basic facts concerning measure 
theory and integration. This allows us to reexamine and develop further 
several important topics that arose in the previous volumes, as well as to 
introduce a number of other subjects of substantial interest in analysis. 
To aid the interested reader, we have starred sections that contain more 
advanced material. These can be omitted on first reading. We also want 
to take this opportunity to thank Daniel Levine for his continuing help in 
proof-reading and the many suggestions he made that are incorporated 
in the text. 

Introduction


I turn away in fright and horror from this lamentable 
plague of functions that do not have derivatives. 

C. Hermite, 1893 


Starting in about 1870 a revolutionary change in the conceptual frame- 
work of analysis began to take shape, one that ultimately led to a vast 
transformation and generalization of the understanding of such basic ob- 
jects as functions, and such notions as continuity, differentiability, and 
integrability. 

The earlier view that the relevant functions in analysis were given by 
formulas or other “analytic” expressions, that these functions were by 
their nature continuous (or nearly so), that by necessity such functions 
had derivatives for most points, and moreover these were integrable by 
the accepted methods of integration — all of these ideas began to give 
way under the weight of various examples and problems that arose in 
the subject, which could not be ignored and required new concepts to 
be understood. Parallel with these developments came new insights that 
were at once both more geometric and more abstract: a clearer under- 
standing of the nature of curves, their rectifiability and their extent; also 
the beginnings of the theory of sets, starting with subsets of the line, the 
plane, etc., and the “measure” that could be assigned to each. 

That is not to say that there was not considerable resistance to the 
change of point-of-view that these advances required. Paradoxically, 
some of the leading mathematicians of the time, those who should have 
been best able to appreciate the new departures, were among the ones 
who were most skeptical. That the new ideas ultimately won out can 
be understood in terms of the many questions that could now be ad- 
dressed. We shall describe here, somewhat imprecisely, several of the 
most significant such problems. 


1 Fourier series: completion 

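Whenever $f$ is a (Riemann) integrable function on $[-\pi, \pi]$ we defined in
Book I its Fourier series $f \sim \sum_{n=-\infty}^{\infty} a_n e^{inx}$ by

$$a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx} \, dx, \tag{1}$$

and saw then that one had Parseval’s identity,

$$\sum_{n=-\infty}^{\infty} |a_n|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx.$$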
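
As a rough numerical illustration of Parseval’s identity, the following Python sketch approximates the Fourier coefficients of the sample function $f(x) = x$ on $[-\pi, \pi]$ by a midpoint rule and compares a partial sum of $|a_n|^2$ with the mean square of $f$. The choice of $f$, the function names, the number of sample points, and the truncation at $|n| \le 50$ are all assumptions made for this example; they do not come from the text.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=4000):
    """Midpoint-rule approximation of a_n = (1/2pi) * integral of f(x) e^{-inx} over [-pi, pi]."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)


def parseval_sides(f, max_n=50, samples=4000):
    """Return (sum of |a_n|^2 over |n| <= max_n, (1/2pi) * integral of |f|^2)."""
    lhs = sum(abs(fourier_coefficient(f, n, samples)) ** 2
              for n in range(-max_n, max_n + 1))
    h = 2 * math.pi / samples
    mean_square = sum(abs(f(-math.pi + (k + 0.5) * h)) ** 2
                      for k in range(samples)) * h / (2 * math.pi)
    return lhs, mean_square


# Illustrative choice f(x) = x: the truncated sum stays below the mean square.
lhs, rhs = parseval_sides(lambda x: x)
print(lhs, rhs)
```

For this $f$ one has $|a_n|^2 = 1/n^2$ for $n \neq 0$, so the truncated sum falls short of the right-hand side $\pi^2/3$ by the tail of the series $2\sum 1/n^2$, and the gap shrinks as more coefficients are included.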


However, the above relationship between functions and their Fourier
coefficients is not completely reciprocal when limited to Riemann integrable
functions. Thus if we consider the space $\mathcal{R}$ of such functions with
its square norm, and the space $\ell^2(\mathbb{Z})$ with its norm,^ each element $f$ in
$\mathcal{R}$ assigns a corresponding element $\{a_n\}$ in $\ell^2(\mathbb{Z})$, and the two norms are
identical. However, it is easy to construct elements in $\ell^2(\mathbb{Z})$ that do not
correspond to functions in $\mathcal{R}$. Note also that the space $\ell^2(\mathbb{Z})$ is complete
in its norm, while $\mathcal{R}$ is not.^ Thus we are led to two questions:

(i) What are the putative “functions” $f$ that arise when we complete
$\mathcal{R}$? In other words: given an arbitrary sequence $\{a_n\} \in \ell^2(\mathbb{Z})$, what
is the nature of the (presumed) function $f$ corresponding to these
coefficients?

(ii) How do we integrate such functions $f$ (and in particular verify (1))?

2 Limits of continuous functions 

Suppose $\{f_n\}$ is a sequence of continuous functions on $[0, 1]$. We assume
that $\lim_{n \to \infty} f_n(x) = f(x)$ exists for every $x$, and inquire as to the nature
of the limiting function $f$.

If we suppose that the convergence is uniform, matters are straightforward
and $f$ is then everywhere continuous. However, once we drop
the assumption of uniform convergence, things may change radically and
the issues that arise can be quite subtle. An example of this is given by
the fact that one can construct a sequence of continuous functions $\{f_n\}$
converging everywhere to $f$ so that


^We use the notation of Chapter 3 in Book I. 

^See the discussion surrounding Theorem 1.1 in Section 1, Chapter 3 of Book I. 

(a) $0 \le f_n(x) \le 1$ for all $x$.

(b) The sequence $f_n(x)$ is monotonically decreasing as $n \to \infty$.

(c) The limiting function $f$ is not Riemann integrable.^

However, in view of (a) and (b), the sequence $\int_0^1 f_n(x) \, dx$ converges to
a limit. So it is natural to ask: what method of integration can be used
to integrate $f$ and obtain that for it

$$\int_0^1 f(x) \, dx = \lim_{n \to \infty} \int_0^1 f_n(x) \, dx?$$

It is with Lebesgue integration that we can solve both this problem 
and the previous one. 

3 Length of curves 

The study of curves in the plane and the calculation of their lengths 
are among the first issues dealt with when one learns calculus. Suppose 
we consider a continuous curve $\Gamma$ in the plane, given parametrically by
$\Gamma = \{(x(t), y(t))\}$, $a \le t \le b$, with $x$ and $y$ continuous functions of $t$. We
define the length of $\Gamma$ in the usual way: as the supremum of the lengths
of all polygonal lines joining successively finitely many points of $\Gamma$, taken
in order of increasing $t$. We say that $\Gamma$ is rectifiable if its length $L$ is
finite. When $x(t)$ and $y(t)$ are continuously differentiable we have the
well-known formula,

$$L = \int_a^b \left( (x'(t))^2 + (y'(t))^2 \right)^{1/2} dt. \tag{2}$$
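
The two descriptions of length can be compared numerically. The Python sketch below is only an illustration under our own assumptions: the sample curve is the parabola $x(t) = t$, $y(t) = t^2$ on $[0, 1]$, the polygons use equally spaced parameter values, and the integral in (2) is approximated by a midpoint rule. The function names are hypothetical.

```python
import math


def polygonal_length(x, y, a, b, n):
    """Length of the polygonal line through n + 1 equally spaced parameter values."""
    ts = [a + (b - a) * k / n for k in range(n + 1)]
    pts = [(x(t), y(t)) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def integral_length(dx, dy, a, b, n=10000):
    """Midpoint-rule approximation of the integral of (x'(t)^2 + y'(t)^2)^(1/2) over [a, b]."""
    h = (b - a) / n
    return sum(math.hypot(dx(a + (k + 0.5) * h), dy(a + (k + 0.5) * h))
               for k in range(n)) * h


# Sample curve (an assumption for this example): the parabola (t, t^2), 0 <= t <= 1.
x, y = (lambda t: t), (lambda t: t * t)
for n in (1, 4, 16, 64):
    print(n, polygonal_length(x, y, 0.0, 1.0, n))
print(integral_length(lambda t: 1.0, lambda t: 2 * t, 0.0, 1.0))
```

Doubling the number of subdivisions refines the polygon, so by the triangle inequality the polygonal lengths can only increase; for this smooth curve they approach the value of the integral from below.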

The problems we are led to arise when we consider general curves. 
More specifically, we can ask: 

(i) What are the conditions on the functions $x(t)$ and $y(t)$ that guarantee
the rectifiability of $\Gamma$?

(ii) When these are satisfied, does the formula (2) hold? 

The first question has a complete answer in terms of the notion of func- 
tions of “bounded variation.” As to the second, it turns out that if x and 
y are of bounded variation, the integral (2) is always meaningful; how- 
ever, the equality fails in general, but can be restored under appropriate 
reparametrization of the curve $\Gamma$.


^The limit / can be highly discontinuous. See, for instance, Exercise 10 in Chapter 1. 

There are further issues that arise. Rectifiable curves, because they
are endowed with length, are genuinely one-dimensional in nature. Are 
there (non-rectifiable) curves that are two-dimensional? We shall see 
that, indeed, there are continuous curves in the plane that fill a square, 
or more generally have any dimension between 1 and 2, if the notion of 
fractional dimension is appropriately defined. 

4 Differentiation and integration 

The so-called “fundamental theorem of the calculus” expresses the fact 
that differentiation and integration are inverse operations, and this can 
be stated in two different ways, which we abbreviate as follows:

$$F(b) - F(a) = \int_a^b F'(x) \, dx, \tag{3}$$

$$\frac{d}{dx} \int_0^x f(y) \, dy = f(x). \tag{4}$$

For the first assertion, the existence of continuous functions $F$ that are
nowhere differentiable, or for which $F'(x)$ exists for every $x$, but $F'$ is
not integrable, leads to the problem of finding a general class of the $F$ for
which (3) is valid. As for (4), the question is to formulate properly and
establish this assertion for the general class of integrable functions $f$ that
arise in the solution of the first two problems considered above. These
questions can be answered with the help of certain “covering” arguments, 
and the notion of absolute continuity. 

5 The problem of measure 

To put matters clearly, the fundamental issue that must be understood 
in order to try to answer all the questions raised above is the problem 
of measure. Stated (imprecisely) in its version in two dimensions, it
is the problem of assigning to each subset $E$ of $\mathbb{R}^2$ its two-dimensional
measure $m_2(E)$, that is, its “area,” extending the standard notion defined
for elementary sets. Let us instead state more precisely the analogous
problem in one dimension, that of constructing one-dimensional measure
$m_1 = m$, which generalizes the notion of length in $\mathbb{R}$.

We are looking for a non-negative function $m$ defined on the family of
subsets $E$ of $\mathbb{R}$ that we allow to be extended-valued, that is, to take on
the value $+\infty$. We require:

(a) $m(E) = b - a$ if $E$ is the interval $[a, b]$, $a \le b$, of length $b - a$.

(b) $m(E) = \sum_{n=1}^{\infty} m(E_n)$ whenever $E = \bigcup_{n=1}^{\infty} E_n$ and the sets $E_n$ are
disjoint.

Condition (b) is the “countable additivity” of the measure $m$. It implies
the special case:

(b') $m(E_1 \cup E_2) = m(E_1) + m(E_2)$ if $E_1$ and $E_2$ are disjoint.

However, to apply the many limiting arguments that arise in the theory
the general case (b) is indispensable, and (b') by itself would definitely
be inadequate.

To the axioms (a) and (b) one adds the translation-invariance of $m$,
namely

(c) $m(E + h) = m(E)$, for every $h \in \mathbb{R}$.

A basic result of the theory is the existence (and uniqueness) of such 
a measure, Lebesgue measure, when one limits oneself to a class of rea- 
sonable sets, those which are “measurable.” This class of sets is closed 
under countable unions, intersections, and complements, and contains 
the open sets, the closed sets, and so forth.^ 

It is with the construction of this measure that we begin our study. 
From it will flow the general theory of integration, and in particular the 
solutions of the problems discussed above. 

A chronology 

We conclude this introduction by listing some of the signal events that 
marked the early development of the subject. 

1872 — Weierstrass’s construction of a nowhere differentiable function. 

1881 — Introduction of functions of bounded variation by Jordan and 
later (1887) connection with rectifiability. 

1883 — Cantor’s ternary set. 

1890 — Construction of a space-filling curve by Peano. 

1898 — Borel’s measurable sets. 

1902 — Lebesgue’s theory of measure and integration. 

1905 — Construction of non-measurable sets by Vitali. 

1906 — Fatou’s application of Lebesgue theory to complex analysis. 


^There is no such measure on the class of all subsets, since there exist non-measurable 
sets. See the construction of such a set at the end of Section 3, Chapter 1. 



Measure Theory 


The sets whose measure we can define by virtue of the 
preceding ideas we will call measurable sets; we do 
this without intending to imply that it is not possible 
to assign a measure to other sets. 

E. Borel, 1898 


This chapter is devoted to the construction of Lebesgue measure in $\mathbb{R}^d$
and the study of the resulting class of measurable functions. After some
preliminaries we pass to the first important definition, that of exterior
measure for any subset $E$ of $\mathbb{R}^d$. This is given in terms of approximations
by unions of cubes that cover E. With this notion in hand we can 
define measurability and thus restrict consideration to those sets that 
are measurable. We then turn to the fundamental result: the collection 
of measurable sets is closed under complements and countable unions, 
and the measure is additive if the subsets in the union are disjoint. 

The concept of measurable functions is a natural outgrowth of the 
idea of measurable sets. It stands in the same relation as the concept 
of continuous functions does to open (or closed) sets. But it has the 
important advantage that the class of measurable functions is closed 
under pointwise limits. 


1 Preliminaries 

We begin by discussing some elementary concepts which are basic to the 
theory developed below. 

The main idea in calculating the “volume” or “measure” of a subset
of $\mathbb{R}^d$ consists of approximating this set by unions of other sets whose
geometry is simple and whose volumes are known. It is convenient to
speak of “volume” when referring to sets in $\mathbb{R}^d$; but in reality it means
“area” in the case $d = 2$ and “length” in the case $d = 1$. In the approach
given here we shall use rectangles and cubes as the main building blocks
of the theory: in $\mathbb{R}$ we use intervals, while in $\mathbb{R}^d$ we take products of
intervals. In all dimensions rectangles are easy to manipulate and have
a standard notion of volume that is given by taking the product of the
length of all sides.

Next, we prove two simple theorems that highlight the importance of
these rectangles in the geometry of open sets: in $\mathbb{R}$ every open set is a
countable union of disjoint open intervals, while in $\mathbb{R}^d$, $d \ge 2$, every open
set is “almost” the disjoint union of closed cubes, in the sense that only 
the boundaries of the cubes can overlap. These two theorems motivate 
the definition of exterior measure given later. 

We shall use the following standard notation. A point $x \in \mathbb{R}^d$ consists
of a $d$-tuple of real numbers

$$x = (x_1, x_2, \ldots, x_d), \quad x_i \in \mathbb{R}, \text{ for } i = 1, \ldots, d.$$

Addition of points is componentwise, and so is multiplication by a real
scalar. The norm of $x$ is denoted by $|x|$ and is defined to be the standard
Euclidean norm given by

$$|x| = \left( x_1^2 + \cdots + x_d^2 \right)^{1/2}.$$

The distance between two points $x$ and $y$ is then simply $|x - y|$.

The complement of a set $E$ in $\mathbb{R}^d$ is denoted by $E^c$ and defined by

$$E^c = \{x \in \mathbb{R}^d : x \notin E\}.$$

If $E$ and $F$ are two subsets of $\mathbb{R}^d$, we denote the complement of $F$ in $E$
by

$$E - F = \{x \in \mathbb{R}^d : x \in E \text{ and } x \notin F\}.$$

The distance between two sets $E$ and $F$ is defined by

$$d(E, F) = \inf |x - y|,$$

where the infimum is taken over all $x \in E$ and $y \in F$.

Open, closed, and compact sets 

The open ball in $\mathbb{R}^d$ centered at $x$ and of radius $r$ is defined by

$$B_r(x) = \{y \in \mathbb{R}^d : |y - x| < r\}.$$

A subset $E \subset \mathbb{R}^d$ is open if for every $x \in E$ there exists $r > 0$ with
$B_r(x) \subset E$. By definition, a set is closed if its complement is open.

We note that any (not necessarily countable) union of open sets is 
open, while in general the intersection of only finitely many open sets 
is open. A similar statement holds for the class of closed sets, if one
interchanges the roles of unions and intersections. 

A set E is bounded if it is contained in some ball of finite radius. 
A bounded set is compact if it is also closed. Compact sets enjoy the 
Heine-Borel covering property: 

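• Assume $E$ is compact, $E \subset \bigcup_\alpha \mathcal{O}_\alpha$, and each $\mathcal{O}_\alpha$ is open. Then
there are finitely many of the open sets, $\mathcal{O}_{\alpha_1}, \mathcal{O}_{\alpha_2}, \ldots, \mathcal{O}_{\alpha_N}$, such
that $E \subset \bigcup_{j=1}^{N} \mathcal{O}_{\alpha_j}$.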

In words, any covering of a compact set by a collection of open sets 
contains a finite subcovering. 

A point $x \in \mathbb{R}^d$ is a limit point of the set $E$ if for every $r > 0$, the ball
$B_r(x)$ contains points of $E$. This means that there are points in $E$ which
are arbitrarily close to $x$. An isolated point of $E$ is a point $x \in E$ such
that there exists an $r > 0$ where $B_r(x) \cap E$ is equal to $\{x\}$.

A point $x \in E$ is an interior point of $E$ if there exists $r > 0$ such
that $B_r(x) \subset E$. The set of all interior points of $E$ is called the interior
of $E$. Also, the closure $\overline{E}$ of $E$ consists of the union of $E$ and all
its limit points. The boundary of a set $E$, denoted by $\partial E$, is the set of
points which are in the closure of $E$ but not in the interior of $E$.

Note that the closure of a set is a closed set; every point in $E$ is a
limit point of $E$; and a set is closed if and only if it contains all its limit
points. Finally, a closed set $E$ is perfect if $E$ does not have any isolated
points.

Rectangles and cubes 

A (closed) rectangle $R$ in $\mathbb{R}^d$ is given by the product of $d$ one-dimensional
closed and bounded intervals

$$R = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_d, b_d],$$

where $a_j \le b_j$ are real numbers, $j = 1, 2, \ldots, d$. In other words, we have

$$R = \{(x_1, \ldots, x_d) \in \mathbb{R}^d : a_j \le x_j \le b_j \text{ for all } j = 1, 2, \ldots, d\}.$$

We remark that in our definition, a rectangle is closed and has sides
parallel to the coordinate axes. In $\mathbb{R}$, the rectangles are precisely the
closed and bounded intervals, while in $\mathbb{R}^2$ they are the usual four-sided
rectangles. In $\mathbb{R}^3$ they are the closed parallelepipeds.

We say that the lengths of the sides of the rectangle $R$ are
$b_1 - a_1, \ldots, b_d - a_d$. The volume of the rectangle $R$ is denoted by $|R|$, and
is defined to be

$$|R| = (b_1 - a_1) \cdots (b_d - a_d).$$

Of course, when $d = 1$ the “volume” equals length, and when $d = 2$ it
equals area.

Figure 1. Rectangles in $\mathbb{R}^d$, $d = 1, 2, 3$

An open rectangle is the product of open intervals, and the interior of
the rectangle $R$ is then

$$(a_1, b_1) \times (a_2, b_2) \times \cdots \times (a_d, b_d).$$

Also, a cube is a rectangle for which $b_1 - a_1 = b_2 - a_2 = \cdots = b_d - a_d$.
So if $Q \subset \mathbb{R}^d$ is a cube of common side length $\ell$, then $|Q| = \ell^d$.

A union of rectangles is said to be almost disjoint if the interiors of 
the rectangles are disjoint. 

In this chapter, coverings by rectangles and cubes play a major role, 
so we isolate here two important lemmas. 

Lemma 1.1 If a rectangle is the almost disjoint union of finitely many
other rectangles, say $R = \bigcup_{k=1}^{N} R_k$, then

$$|R| = \sum_{k=1}^{N} |R_k|.$$

Proof. We consider the grid formed by extending indefinitely the
sides of all rectangles $R_1, \ldots, R_N$. This construction yields finitely many
rectangles $\tilde{R}_1, \ldots, \tilde{R}_M$, and a partition $J_1, \ldots, J_N$ of the integers between
1 and $M$, such that the unions

$$R = \bigcup_{j=1}^{M} \tilde{R}_j \quad \text{and} \quad R_k = \bigcup_{j \in J_k} \tilde{R}_j, \quad \text{for } k = 1, \ldots, N$$

are almost disjoint (see the illustration in Figure 2).


Figure 2. The grid formed by the rectangles $R_k$


For the rectangle $R$, for example, we see that $|R| = \sum_{j=1}^{M} |\tilde{R}_j|$, since
the grid actually partitions the sides of $R$ and each $\tilde{R}_j$ consists of taking
products of the intervals in these partitions. Thus when adding the
volumes of the $\tilde{R}_j$ we are summing the corresponding products of lengths
of the intervals that arise. Since this also holds for the other rectangles
$R_1, \ldots, R_N$, we conclude that

$$|R| = \sum_{j=1}^{M} |\tilde{R}_j| = \sum_{k=1}^{N} \sum_{j \in J_k} |\tilde{R}_j| = \sum_{k=1}^{N} |R_k|.$$
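
The grid argument of this proof can be traced on a small example. The following Python sketch is a minimal illustration, assuming rectangles are represented as tuples of $(a_j, b_j)$ pairs with rational endpoints; the function names and the sample decomposition of $[0, 3] \times [0, 2]$ are our own choices, not part of the text.

```python
from fractions import Fraction
from itertools import product


def volume(rect):
    """|R| for a rectangle given as a tuple of (a_j, b_j) pairs."""
    v = Fraction(1)
    for a, b in rect:
        v *= Fraction(b) - Fraction(a)
    return v


def grid_cells(rects):
    """Cells of the grid obtained by extending the sides of all the rectangles."""
    d = len(rects[0])
    axes = []
    for j in range(d):
        cuts = sorted({Fraction(c) for r in rects for c in r[j]})
        axes.append(list(zip(cuts, cuts[1:])))
    return [tuple(cell) for cell in product(*axes)]


def contains(rect, cell):
    """True if the grid cell lies inside the rectangle."""
    return all(Fraction(a) <= lo and hi <= Fraction(b)
               for (a, b), (lo, hi) in zip(rect, cell))


def grid_volumes(R, pieces):
    """Return |R|, the cell volumes inside R, the cell volumes grouped by piece, and sum |R_k|."""
    cells = grid_cells([R] + list(pieces))
    total_R = sum(volume(c) for c in cells if contains(R, c))
    total_J = sum(volume(c) for Rk in pieces for c in cells if contains(Rk, c))
    return volume(R), total_R, total_J, sum(volume(Rk) for Rk in pieces)


# Sample almost disjoint decomposition (an assumption for this example).
R = ((0, 3), (0, 2))
pieces = [((0, 1), (0, 2)), ((1, 3), (0, 1)), ((1, 3), (1, 2))]
print(grid_volumes(R, pieces))
```

The four numbers returned agree, mirroring the chain of equalities in the proof. If the pieces overlapped, the third number would count some grid cells more than once, which is the situation in which only an inequality can be expected.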


A slight modification of this argument then yields the following: 

Lemma 1.2 If $R, R_1, \ldots, R_N$ are rectangles, and $R \subset \bigcup_{k=1}^{N} R_k$, then

$$|R| \le \sum_{k=1}^{N} |R_k|.$$

The main idea consists of taking the grid formed by extending all sides
of the rectangles $R, R_1, \ldots, R_N$, and noting that the sets corresponding
to the $J_k$ (in the above proof) need not be disjoint any more.

We now proceed to give a description of the structure of open sets in
terms of cubes. We begin with the case of $\mathbb{R}$.

Theorem 1.3 Every open subset $\mathcal{O}$ of $\mathbb{R}$ can be written uniquely as a
countable union of disjoint open intervals.

Proof. For each $x \in \mathcal{O}$, let $I_x$ denote the largest open interval containing
$x$ and contained in $\mathcal{O}$. More precisely, since $\mathcal{O}$ is open, $x$ is contained
in some small (non-trivial) interval, and therefore if

$$a_x = \inf\{a < x : (a, x) \subset \mathcal{O}\} \quad \text{and} \quad b_x = \sup\{b > x : (x, b) \subset \mathcal{O}\}$$

we must have $a_x < x < b_x$ (with possibly infinite values for $a_x$ and $b_x$).
If we now let $I_x = (a_x, b_x)$, then by construction we have $x \in I_x$ as well
as $I_x \subset \mathcal{O}$. Hence

$$\mathcal{O} = \bigcup_{x \in \mathcal{O}} I_x.$$

Now suppose that two intervals $I_x$ and $I_y$ intersect. Then their union
(which is also an open interval) is contained in $\mathcal{O}$ and contains $x$. Since
$I_x$ is maximal, we must have $(I_x \cup I_y) \subset I_x$, and similarly $(I_x \cup I_y) \subset I_y$.
This can happen only if $I_x = I_y$; therefore, any two distinct intervals in
the collection $\mathcal{I} = \{I_x\}_{x \in \mathcal{O}}$ must be disjoint. The proof will be complete
once we have shown that there are only countably many distinct intervals
in the collection $\mathcal{I}$. This, however, is easy to see, since every open interval
$I_x$ contains a rational number. Since different intervals are disjoint, they
must contain distinct rationals, and therefore $\mathcal{I}$ is countable, as desired.
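
For an open set that is given as a finite union of open intervals, the maximal intervals of the proof can be found by sorting and merging. The Python sketch below covers only this finite special case; the representation of an open interval $(a, b)$ as a pair and the function name are assumptions made for illustration.

```python
def components(intervals):
    """Disjoint open intervals with the same union as the given open intervals (a, b)."""
    result = []
    for a, b in sorted((a, b) for a, b in intervals if a < b):
        if result and a < result[-1][1]:
            # (a, b) meets the current component, so the component is extended.
            result[-1][1] = max(result[-1][1], b)
        else:
            # (a, b) lies to the right of everything seen so far: a new component.
            result.append([a, b])
    return [tuple(c) for c in result]


# (0, 2) and (2, 2.5) stay separate: the point 2 belongs to neither open interval.
print(components([(0, 1), (0.5, 2), (3, 4), (2, 2.5)]))
```

The comparison is strict because the intervals are open: two intervals that share only an endpoint do not intersect, and their union is not an interval.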

Naturally, if $\mathcal{O}$ is open and $\mathcal{O} = \bigcup_{j=1}^{\infty} I_j$, where the $I_j$’s are disjoint
open intervals, the measure of $\mathcal{O}$ ought to be $\sum_{j=1}^{\infty} |I_j|$. Since this
representation is unique, we could take this as a definition of measure; we
would then note that whenever $\mathcal{O}_1$ and $\mathcal{O}_2$ are open and disjoint, the
measure of their union is the sum of their measures. Although this provides
a natural notion of measure for an open set, it is not immediately clear
how to generalize it to other sets in $\mathbb{R}$. Moreover, a similar approach in
higher dimensions already encounters complications even when defining 
measures of open sets, since in this context the direct analogue of The- 
orem 1.3 is not valid (see Exercise 12). There is, however, a substitute 
result. 

Theorem 1.4 Every open subset $\mathcal{O}$ of $\mathbb{R}^d$, $d \ge 1$, can be written as a
countable union of almost disjoint closed cubes.

Proof. We must construct a countable collection $\mathcal{Q}$ of closed cubes
whose interiors are disjoint, and so that $\mathcal{O} = \bigcup_{Q \in \mathcal{Q}} Q$.

As a first step, consider the grid in $\mathbb{R}^d$ formed by taking all closed cubes
of side length 1 whose vertices have integer coordinates. In other words,
we consider the natural grid of lines parallel to the axes, that is, the grid
generated by the lattice $\mathbb{Z}^d$. We shall also use the grids formed by cubes
of side length $2^{-N}$ obtained by successively bisecting the original grid.

We either accept or reject cubes in the initial grid as part of $\mathcal{Q}$ according
to the following rule: if $Q$ is entirely contained in $\mathcal{O}$ then we accept
$Q$; if $Q$ intersects both $\mathcal{O}$ and $\mathcal{O}^c$ then we tentatively accept it; and if $Q$
is entirely contained in $\mathcal{O}^c$ then we reject it.

As a second step, we bisect the tentatively accepted cubes into $2^d$ cubes
with side length $1/2$. We then repeat our procedure, by accepting the
smaller cubes if they are completely contained in $\mathcal{O}$, tentatively accepting
them if they intersect both $\mathcal{O}$ and $\mathcal{O}^c$, and rejecting them if they are
contained in $\mathcal{O}^c$. Figure 3 illustrates these steps for an open set in $\mathbb{R}^2$.




Figure 3. Decomposition of $\mathcal{O}$ into almost disjoint cubes

This procedure is then repeated indefinitely, and (by construction)
the resulting collection $\mathcal{Q}$ of all accepted cubes is countable and consists
of almost disjoint cubes. To see why their union is all of $\mathcal{O}$, we note
that given $x \in \mathcal{O}$ there exists a cube of side length $2^{-N}$ (obtained from
successive bisections of the original grid) that contains $x$ and that is
entirely contained in $\mathcal{O}$. Either this cube has been accepted, or it is
contained in a cube that has been previously accepted. This shows that
the union of all cubes in $\mathcal{Q}$ covers $\mathcal{O}$.
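
The accept, tentatively accept, and reject rule can be carried out by machine for a concrete open set, stopping after finitely many bisections. The Python sketch below takes the open set to be an open disc in $\mathbb{R}^2$, an assumption chosen because containment and disjointness are then easy to test exactly; the function names and the cut-off depth are likewise hypothetical.

```python
import math


def classify(x, y, side, cx, cy, r):
    """Position of the closed square [x, x+side] x [y, y+side] relative to the open disc."""
    far = max(math.hypot(px - cx, py - cy)
              for px in (x, x + side) for py in (y, y + side))
    if far < r:
        return "accept"        # all corners inside the disc, hence (by convexity) the square
    nx = min(max(cx, x), x + side)   # point of the square nearest to the centre
    ny = min(max(cy, y), y + side)
    if math.hypot(nx - cx, ny - cy) >= r:
        return "reject"        # the square lies in the complement of the disc
    return "tentative"         # the square meets both the disc and its complement


def decompose(cx, cy, r, depth):
    """Accepted squares (x, y, side) after bisecting down to side length 2**(-depth)."""
    accepted = []
    pending = [(float(i), float(j), 1.0)
               for i in range(math.floor(cx - r), math.ceil(cx + r))
               for j in range(math.floor(cy - r), math.ceil(cy + r))]
    for _ in range(depth + 1):
        next_pending = []
        for x, y, side in pending:
            kind = classify(x, y, side, cx, cy, r)
            if kind == "accept":
                accepted.append((x, y, side))
            elif kind == "tentative":
                half = side / 2
                next_pending += [(x + dx, y + dy, half)
                                 for dx in (0, half) for dy in (0, half)]
        pending = next_pending
    return accepted


cubes = decompose(0.0, 0.0, 1.0, 7)
area = sum(side * side for _, _, side in cubes)
print(len(cubes), area)
```

Only finitely many of the countably many cubes are produced at any fixed depth, so the total area of the accepted squares stays below the area $\pi$ of the unit disc and increases toward it as the depth grows.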

Once again, if $\mathcal{O} = \bigcup_{j=1}^{\infty} R_j$ where the rectangles $R_j$ are almost disjoint,
it is reasonable to assign to $\mathcal{O}$ the measure $\sum_{j=1}^{\infty} |R_j|$. This is
natural since the volume of the boundary of each rectangle should be 0,
and the overlap of the rectangles should not contribute to the volume
of $\mathcal{O}$. We note, however, that the above decomposition into cubes is
not unique, and it is not immediate that the sum is independent of this
decomposition. So in $\mathbb{R}^d$ with $d \ge 2$, the notion of volume or area, even
for open sets, is more subtle.

The general theory developed in the next section actually yields a 
notion of volume that is consistent with the decompositions of open sets 
of the previous two theorems, and applies to all dimensions. Before we 
come to that, we discuss an important example in $\mathbb{R}$.


The Cantor set 

The Cantor set plays a prominent role in set theory and in analysis in 
general. It and its variants provide a rich source of enlightening examples. 

We begin with the closed unit interval $C_0 = [0, 1]$ and let $C_1$ denote
the set obtained from deleting the middle third open interval from $[0, 1]$,
that is,

$$C_1 = [0, 1/3] \cup [2/3, 1].$$

Next, we repeat this procedure for each sub-interval of $C_1$; that is, we
delete the middle third open interval. At the second stage we get

$$C_2 = [0, 1/9] \cup [2/9, 1/3] \cup [2/3, 7/9] \cup [8/9, 1].$$

We repeat this process for each sub-interval of $C_2$, and so on (Figure 4).

This procedure yields a sequence $C_k$, $k = 0, 1, 2, \ldots$ of compact sets
with

$$C_0 \supset C_1 \supset C_2 \supset \cdots \supset C_k \supset C_{k+1} \supset \cdots.$$

Figure 4. Construction of the Cantor set


The Cantor set $C$ is by definition the intersection of all $C_k$’s:

$$C = \bigcap_{k=0}^{\infty} C_k.$$

The set $C$ is not empty, since all end-points of the intervals in $C_k$ (all $k$)
belong to $C$.

Despite its simple construction, the Cantor set enjoys many interesting
topological and analytical properties. For instance, $C$ is closed and
bounded, hence compact. Also, $C$ is totally disconnected: given any
$x, y \in C$ there exists $z \notin C$ that lies between $x$ and $y$. Finally, $C$ is
perfect: it has no isolated points (Exercise 1).

Next, we turn our attention to the question of determining the “size” 
of C. This is a delicate problem, one that may be approached from 
different angles depending on the notion of size we adopt. For instance, 
in terms of cardinality the Cantor set is rather large: it is not countable. 
Since it can be mapped to the interval [0,1], the Cantor set has the 
cardinality of the continuum (Exercise 2). 

However, from the point of view of “length” the size of $C$ is small.
Roughly speaking, the Cantor set has length zero, and this follows from
the following intuitive argument: the set $C$ is covered by sets $C_k$ whose
lengths go to zero. Indeed, $C_k$ is a disjoint union of $2^k$ intervals of length
$3^{-k}$, making the total length of $C_k$ equal to $(2/3)^k$. But $C \subset C_k$ for all
$k$, and $(2/3)^k \to 0$ as $k$ tends to infinity. We shall define a notion of
measure and make this argument precise in the next section.
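
The counting in this argument is easy to check by machine. The following Python sketch, a minimal illustration with hypothetical function names, builds the intervals of the $k$-th stage with exact rational arithmetic and adds up their lengths.

```python
from fractions import Fraction


def cantor_stage(k):
    """Closed intervals of the k-th stage, as (left, right) pairs of exact fractions."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            # Keep the two outer closed thirds; the open middle third is deleted.
            refined += [(a, a + third), (b - third, b)]
        intervals = refined
    return intervals


def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(b - a for a, b in intervals)


for k in range(6):
    stage = cantor_stage(k)
    print(k, len(stage), total_length(stage))
```

Each stage doubles the number of intervals and multiplies the total length by $2/3$, which is the behaviour used in the argument above.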


2 The exterior measure 

The notion of exterior measure is the first of two important concepts 
needed to develop a theory of measure. We begin with the definition and 
basic properties of exterior measure. Loosely speaking, the exterior measure
$m_*$ assigns to any subset of $\mathbb{R}^d$ a first notion of size; various examples
show that this notion coincides with our earlier intuition. However, the
exterior measure lacks the desirable property of additivity when taking 
the union of disjoint sets. We remedy this problem in the next section, 
where we discuss in detail the other key concept of measure theory, the 
notion of measurable sets. 

The exterior measure, as the name indicates, attempts to describe 
the volume of a set E by approximating it from the outside. The set 
E is covered by cubes, and if the covering gets finer, with fewer cubes 
overlapping, the volume of E should be close to the sum of the volumes 
of the cubes. 

The precise definition is as follows: if $E$ is any subset of $\mathbb{R}^d$, the
exterior measure^ of $E$ is

(1)  $$m_*(E) = \inf \sum_{j=1}^{\infty} |Q_j|,$$

where the infimum is taken over all countable coverings $E \subset \bigcup_{j=1}^{\infty} Q_j$ by
closed cubes. The exterior measure is always non-negative but could be
infinite, so that in general we have $0 \le m_*(E) \le \infty$, and therefore takes
values in the extended positive numbers.

We make some preliminary remarks about the definition of the exterior 
measure given by (1). 

(i) It is important to note that it would not suffice to allow finite sums
in the definition of $m_*(E)$. The quantity that would be obtained if one
considered only coverings of $E$ by finite unions of cubes is in general
larger than $m_*(E)$. (See Exercise 14.)

(ii) One can, however, replace the coverings by cubes with coverings
by rectangles, or with coverings by balls. That the former alternative
yields the same exterior measure is quite direct. (See Exercise 15.) The
equivalence with the latter is more subtle. (See Exercise 26 in Chapter 3.)


^Some authors use the term outer measure instead of exterior measure.

We begin our investigation of this new notion by providing examples 
of sets whose exterior measures can be calculated, and we check that 
the latter matches our intuitive idea of volume (length in one dimension, 
area in two dimensions, etc.) 

Example 1. The exterior measure of a point is zero. This is clear once 
we observe that a point is a cube with volume zero, and which covers 
itself. Of course the exterior measure of the empty set is also zero. 

Example 2. The exterior measure of a closed cube is equal to its volume.
Indeed, suppose $Q$ is a closed cube in $\mathbb{R}^d$. Since $Q$ covers itself, we must
have $m_*(Q) \le |Q|$. Therefore, it suffices to prove the reverse inequality.

We consider an arbitrary covering $Q \subset \bigcup_{j=1}^{\infty} Q_j$ by cubes, and note
that it suffices to prove that

(2)  $$|Q| \le \sum_{j=1}^{\infty} |Q_j|.$$

For a fixed $\epsilon > 0$ we choose for each $j$ an open cube $S_j$ which contains $Q_j$,
and such that $|S_j| \le (1 + \epsilon)|Q_j|$. From the open covering $\bigcup_{j=1}^{\infty} S_j$ of the
compact set $Q$, we may select a finite subcovering which, after possibly
renumbering the rectangles, we may write as $Q \subset \bigcup_{j=1}^{N} S_j$. Taking the
closure of the cubes $S_j$, we may apply Lemma 1.2 to conclude that
$|Q| \le \sum_{j=1}^{N} |S_j|$. Consequently,

$$|Q| \le (1 + \epsilon) \sum_{j=1}^{N} |Q_j| \le (1 + \epsilon) \sum_{j=1}^{\infty} |Q_j|.$$

Since $\epsilon$ is arbitrary, we find that the inequality (2) holds; thus
$|Q| \le m_*(Q)$, as desired.

Example 3. If $Q$ is an open cube, the result $m_*(Q) = |Q|$ still holds.
Since $Q$ is covered by its closure $\overline{Q}$, and $|\overline{Q}| = |Q|$, we immediately see
that $m_*(Q) \le |Q|$. To prove the reverse inequality, we note that if $Q_0$ is
a closed cube contained in $Q$, then $m_*(Q_0) \le m_*(Q)$, since any covering
of $Q$ by a countable number of closed cubes is also a covering of $Q_0$ (see
Observation 1 below). Hence $|Q_0| \le m_*(Q)$, and since we can choose $Q_0$
with a volume as close as we wish to $|Q|$, we must have $|Q| \le m_*(Q)$.

Example 4. The exterior measure of a rectangle $R$ is equal to its volume.
Indeed, arguing as in Example 2, we see that $|R| \le m_*(R)$. To obtain the
reverse inequality, consider a grid in $\mathbb{R}^d$ formed by cubes of side length
$1/k$. Then, if $\mathcal{Q}$ consists of the (finite) collection of all cubes entirely
contained in $R$, and $\mathcal{Q}'$ the (finite) collection of all cubes that intersect
both $R$ and the complement of $R$, we first note that
$R \subset \bigcup_{Q \in (\mathcal{Q} \cup \mathcal{Q}')} Q$. Also, a simple argument yields

$$\sum_{Q \in \mathcal{Q}} |Q| \le |R|.$$

Moreover, there are $O(k^{d-1})$ cubes^ in $\mathcal{Q}'$, and these cubes have volume
$k^{-d}$, so that $\sum_{Q \in \mathcal{Q}'} |Q| = O(1/k)$. Hence

$$\sum_{Q \in (\mathcal{Q} \cup \mathcal{Q}')} |Q| \le |R| + O(1/k),$$

and letting $k$ tend to infinity yields $m_*(R) \le |R|$, as desired.
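
The grid argument for a rectangle can be made concrete in a special case. The sketch below is an illustration under assumptions of its own: it takes $d = 2$, a rectangle $R = [0,a] \times [0,b]$ with rational sides, and grid cubes whose corners lie on the lattice of mesh $1/k$; the function name `grid_cover_volumes` is hypothetical. It returns the total volume of the grid cubes contained in $R$ and the total volume of all grid cubes that meet $R$.

```python
from fractions import Fraction
from math import floor

def grid_cover_volumes(a, b, k):
    """Total volume of grid cubes inside R = [0,a] x [0,b], and of all grid cubes meeting R."""
    def counts(side):
        n = floor(side * k)
        inside = n       # intervals [i/k, (i+1)/k] contained in [0, side]: i = 0, ..., n-1
        meeting = n + 2  # intervals that meet [0, side]: i = -1, ..., n
        return inside, meeting

    in_a, meet_a = counts(a)
    in_b, meet_b = counts(b)
    cube_volume = Fraction(1, k * k)  # each grid square has side 1/k
    return in_a * in_b * cube_volume, meet_a * meet_b * cube_volume

a, b = Fraction(3, 2), Fraction(5, 4)
for k in (10, 100, 1000):
    inner, cover = grid_cover_volumes(a, b, k)
    # The inner volume stays below |R| = a*b and the covering volume stays above it.
    print(k, float(inner), float(a * b), float(cover))
```

As $k$ grows, the gap between the covering volume and $|R|$ shrinks at the rate $1/k$, in line with the $O(1/k)$ term in the example.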

Example 5. The exterior measure of $\mathbb{R}^d$ is infinite. This follows from
the fact that any covering of $\mathbb{R}^d$ is also a covering of any cube $Q \subset \mathbb{R}^d$,
hence $|Q| \le m_*(\mathbb{R}^d)$. Since $Q$ can have arbitrarily large volume, we must
have $m_*(\mathbb{R}^d) = \infty$.

Example 6. The Cantor set $C$ has exterior measure 0. From the
construction of $C$, we know that $C \subset C_k$, where each $C_k$ is a disjoint union
of $2^k$ closed intervals, each of length $3^{-k}$. Consequently, $m_*(C) \le (2/3)^k$
for all $k$, hence $m_*(C) = 0$.


Properties of the exterior measure 

The previous examples and comments provide some intuition underlying 
the definition of exterior measure. Here, we turn to the further study of 
m* and prove five properties of exterior measure that are needed in what 
follows. 

First, we record the following remark that is immediate from the def- 
inition of m*: 


• For every $\epsilon > 0$, there exists a covering $E \subset \bigcup_{j=1}^{\infty} Q_j$ with

$$\sum_{j=1}^{\infty} m_*(Q_j) \le m_*(E) + \epsilon.$$


^We remind the reader of the notation $f(x) = O(g(x))$, which means that $|f(x)| \le C|g(x)|$
for some constant $C$ and all $x$ in a given range. In this particular example, there
are fewer than $Ck^{d-1}$ cubes in question, as $k \to \infty$.


The relevant properties of exterior measure are listed in a series of 
observations. 

Observation 1 (Monotonicity) If $E_1 \subset E_2$, then $m_*(E_1) \le m_*(E_2)$.

This follows once we observe that any covering of $E_2$ by a countable
collection of cubes is also a covering of $E_1$.

In particular, monotonicity implies that every bounded subset of $\mathbb{R}^d$
has finite exterior measure.

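Observation 2 (Countable sub-additivity) If $E = \bigcup_{j=1}^{\infty} E_j$, then
$m_*(E) \le \sum_{j=1}^{\infty} m_*(E_j)$.

First, we may assume that each $m_*(E_j) < \infty$, for otherwise the
inequality clearly holds. For any $\epsilon > 0$, the definition of the exterior
measure yields for each $j$ a covering $E_j \subset \bigcup_{k=1}^{\infty} Q_{k,j}$ by closed cubes with

$$\sum_{k=1}^{\infty} |Q_{k,j}| \le m_*(E_j) + \frac{\epsilon}{2^j}.$$

Then, $E \subset \bigcup_{j,k=1}^{\infty} Q_{k,j}$ is a covering of $E$ by closed cubes, and therefore

$$m_*(E) \le \sum_{j,k} |Q_{k,j}| = \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} |Q_{k,j}|
\le \sum_{j=1}^{\infty} \left( m_*(E_j) + \frac{\epsilon}{2^j} \right)
= \sum_{j=1}^{\infty} m_*(E_j) + \epsilon.$$

Since this holds true for every $\epsilon > 0$, the second observation is proved.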
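
The device of spending $\epsilon/2^j$ on the $j$-th set is worth seeing in a concrete case. The sketch below is an illustration, not part of the original argument: it applies the same idea to a countable set of points, here the rationals in $[0,1]$ listed by increasing denominator, covering the $j$-th point by an interval of length $\epsilon/2^j$. The function names are hypothetical. Since the total length of the cover stays below $\epsilon$, this is the mechanism by which sub-additivity shows that a countable set has exterior measure zero.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], listed by increasing q."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Cover the j-th point (j = 1, 2, ...) by a closed interval of length eps / 2**j."""
    return [(x - eps / 2 ** (j + 1), x + eps / 2 ** (j + 1))
            for j, x in enumerate(points, start=1)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
# Total length is eps * (1 - 2**-50), which is strictly less than eps.
print(float(sum(right - left for left, right in intervals)))
```

However many points are listed, the total length never reaches $\epsilon$, because the lengths form a geometric series.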

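Observation 3 If $E \subset \mathbb{R}^d$, then $m_*(E) = \inf m_*(\mathcal{O})$, where the
infimum is taken over all open sets $\mathcal{O}$ containing $E$.

By monotonicity, it is clear that the inequality $m_*(E) \le \inf m_*(\mathcal{O})$
holds. For the reverse inequality, let $\epsilon > 0$ and choose cubes $Q_j$ such
that $E \subset \bigcup_{j=1}^{\infty} Q_j$, with

$$\sum_{j=1}^{\infty} |Q_j| \le m_*(E) + \frac{\epsilon}{2}.$$

Let $Q_j^0$ denote an open cube containing $Q_j$, and such that
$|Q_j^0| \le |Q_j| + \epsilon/2^{j+1}$. Then $\mathcal{O} = \bigcup_{j=1}^{\infty} Q_j^0$ is open, and by Observation 2

$$m_*(\mathcal{O}) \le \sum_{j=1}^{\infty} m_*(Q_j^0) = \sum_{j=1}^{\infty} |Q_j^0|
\le \sum_{j=1}^{\infty} \left( |Q_j| + \frac{\epsilon}{2^{j+1}} \right)
\le \sum_{j=1}^{\infty} |Q_j| + \frac{\epsilon}{2}
\le m_*(E) + \epsilon.$$

Hence $\inf m_*(\mathcal{O}) \le m_*(E)$, as was to be shown.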


Observation 4 If $E = E_1 \cup E_2$, and $d(E_1, E_2) > 0$, then

$$m_*(E) = m_*(E_1) + m_*(E_2).$$

By Observation 2, we already know that $m_*(E) \le m_*(E_1) + m_*(E_2)$,
so it suffices to prove the reverse inequality. To this end, we first select $\delta$
such that $d(E_1, E_2) > \delta > 0$. Next, we choose a covering $E \subset \bigcup_{j=1}^{\infty} Q_j$ by
closed cubes, with $\sum_{j=1}^{\infty} |Q_j| \le m_*(E) + \epsilon$. We may, after subdividing
the cubes $Q_j$, assume that each $Q_j$ has a diameter less than $\delta$. In this
case, each $Q_j$ can intersect at most one of the two sets $E_1$ or $E_2$. If we
denote by $J_1$ and $J_2$ the sets of those indices $j$ for which $Q_j$ intersects
$E_1$ and $E_2$, respectively, then $J_1 \cap J_2$ is empty, and we have

$$E_1 \subset \bigcup_{j \in J_1} Q_j \quad \text{as well as} \quad E_2 \subset \bigcup_{j \in J_2} Q_j.$$

Therefore,

$$m_*(E_1) + m_*(E_2) \le \sum_{j \in J_1} |Q_j| + \sum_{j \in J_2} |Q_j|
\le \sum_{j=1}^{\infty} |Q_j| \le m_*(E) + \epsilon.$$

Since $\epsilon$ is arbitrary, the proof of Observation 4 is complete.

Observation 5 If a set $E$ is the countable union of almost disjoint cubes
$E = \bigcup_{j=1}^{\infty} Q_j$, then

$$m_*(E) = \sum_{j=1}^{\infty} |Q_j|.$$

Let $\tilde{Q}_j$ denote a cube strictly contained in $Q_j$ such that
$|Q_j| \le |\tilde{Q}_j| + \epsilon/2^j$, where $\epsilon$ is arbitrary but fixed. Then, for every $N$, the cubes
$\tilde{Q}_1, \tilde{Q}_2, \ldots, \tilde{Q}_N$ are disjoint, hence at a positive distance from one another,
and repeated applications of Observation 4 imply

$$m_*\left( \bigcup_{j=1}^{N} \tilde{Q}_j \right) = \sum_{j=1}^{N} |\tilde{Q}_j|
\ge \sum_{j=1}^{N} \left( |Q_j| - \epsilon/2^j \right).$$

Since $\bigcup_{j=1}^{N} \tilde{Q}_j \subset E$, we conclude that for every integer $N$,

$$m_*(E) \ge \sum_{j=1}^{N} |Q_j| - \epsilon.$$

In the limit as $N$ tends to infinity we deduce $\sum_{j=1}^{\infty} |Q_j| \le m_*(E) + \epsilon$
for every $\epsilon > 0$, hence $\sum_{j=1}^{\infty} |Q_j| \le m_*(E)$. Therefore, combined with
Observation 2, our result proves that we have equality.

This last property shows that if a set can be decomposed into almost 
disjoint cubes, its exterior measure equals the sum of the volumes of the 
cubes. In particular, by Theorem 1.4 we see that the exterior measure of 
an open set equals the sum of the volumes of the cubes in a decomposi- 
tion, and this coincides with our initial guess. Moreover, this also yields 
a proof that the sum is independent of the decomposition. 



One can see from this that the volumes of simple sets that are cal- 
culated by elementary calculus agree with their exterior measure. This 
assertion can be proved most easily once we have developed the requisite 
tools in integration theory. (See Chapter 2.) In particular, we can then 
verify that the exterior measure of a ball (either open or closed) equals 
its volume. 

Despite Observations 4 and 5, one cannot conclude in general that if
$E_1 \cup E_2$ is a disjoint union of subsets of $\mathbb{R}^d$, then

(3)  $$m_*(E_1 \cup E_2) = m_*(E_1) + m_*(E_2).$$

In fact (3) holds when the sets in question are not highly irregular or 
“pathological” but are measurable in the sense described below. 

3 Measurable sets and the Lebesgue measure 

The notion of measurability isolates a collection of subsets in $\mathbb{R}^d$ for
which the exterior measure satisfies all our desired properties, including
additivity (and in fact countable additivity) for disjoint unions of sets.

There are a number of different ways of defining measurability, but
these all turn out to be equivalent. Probably the simplest and most
intuitive is the following: A subset $E$ of $\mathbb{R}^d$ is Lebesgue measurable,
or simply measurable, if for any $\epsilon > 0$ there exists an open set $\mathcal{O}$ with
$E \subset \mathcal{O}$ and

$$m_*(\mathcal{O} - E) \le \epsilon.$$

This should be compared to Observation 3, which holds for all sets $E$.

If $E$ is measurable, we define its Lebesgue measure (or measure)
$m(E)$ by

$$m(E) = m_*(E).$$

Clearly, the Lebesgue measure inherits all the features contained in
Observations 1-5 of the exterior measure.

Immediately from the definition, we find:

Property 1 Every open set in $\mathbb{R}^d$ is measurable.

Our immediate goal now is to gather various further properties of
measurable sets. In particular, we shall prove that the collection of
measurable sets behaves well under the various operations of set theory:
countable unions, countable intersections, and complements.

Property 2 If $m_*(E) = 0$, then $E$ is measurable. In particular, if $F$ is
a subset of a set of exterior measure 0, then $F$ is measurable.

By Observation 3 of the exterior measure, for every $\epsilon > 0$ there
exists an open set $\mathcal{O}$ with $E \subset \mathcal{O}$ and $m_*(\mathcal{O}) \le \epsilon$. Since $(\mathcal{O} - E) \subset \mathcal{O}$,
monotonicity implies $m_*(\mathcal{O} - E) \le \epsilon$, as desired.

As a consequence of this property, we deduce that the Cantor set $C$ in
Example 6 is measurable and has measure 0.

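Property 3 A countable union of measurable sets is measurable.

Suppose $E = \bigcup_{j=1}^{\infty} E_j$, where each $E_j$ is measurable. Given $\epsilon > 0$, we
may choose for each $j$ an open set $\mathcal{O}_j$ with $E_j \subset \mathcal{O}_j$ and
$m_*(\mathcal{O}_j - E_j) \le \epsilon/2^j$. Then the union $\mathcal{O} = \bigcup_{j=1}^{\infty} \mathcal{O}_j$ is open, $E \subset \mathcal{O}$, and
$(\mathcal{O} - E) \subset \bigcup_{j=1}^{\infty} (\mathcal{O}_j - E_j)$, so monotonicity and sub-additivity of the
exterior measure imply

$$m_*(\mathcal{O} - E) \le \sum_{j=1}^{\infty} m_*(\mathcal{O}_j - E_j) \le \epsilon.$$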

Property 4 Closed sets are measurable.

First, we observe that it suffices to prove that compact sets are
measurable. Indeed, any closed set $F$ can be written as the union of compact
sets, say $F = \bigcup_{k=1}^{\infty} F \cap B_k$, where $B_k$ denotes the closed ball of radius $k$
centered at the origin; then Property 3 applies.

So, suppose $F$ is compact (so that in particular $m_*(F) < \infty$), and let
$\epsilon > 0$. By Observation 3 we can select an open set $\mathcal{O}$ with $F \subset \mathcal{O}$ and
$m_*(\mathcal{O}) \le m_*(F) + \epsilon$. Since $F$ is closed, the difference $\mathcal{O} - F$ is open,
and by Theorem 1.4 we may write this difference as a countable union
of almost disjoint cubes

$$\mathcal{O} - F = \bigcup_{j=1}^{\infty} Q_j.$$

For a fixed $N$, the finite union $K = \bigcup_{j=1}^{N} Q_j$ is compact; therefore
$d(K, F) > 0$ (we isolate this little fact in a lemma below). Since
$(K \cup F) \subset \mathcal{O}$, Observations 1, 4, and 5 of the exterior measure imply

$$m_*(\mathcal{O}) \ge m_*(F) + m_*(K) = m_*(F) + \sum_{j=1}^{N} m_*(Q_j).$$

Hence $\sum_{j=1}^{N} m_*(Q_j) \le m_*(\mathcal{O}) - m_*(F) \le \epsilon$, and this also holds in the
limit as $N$ tends to infinity. Invoking the sub-additivity property of the
exterior measure finally yields

$$m_*(\mathcal{O} - F) \le \sum_{j=1}^{\infty} m_*(Q_j) \le \epsilon,$$

as desired.

We digress briefly to complete the above argument by proving the 
following. 

Lemma 3.1 If $F$ is closed, $K$ is compact, and these sets are disjoint,
then $d(F, K) > 0$.

Proof. Since $F$ is closed, for each point $x \in K$, there exists $\delta_x > 0$ so
that $d(x, F) > 3\delta_x$. Since $\bigcup_{x \in K} B_{2\delta_x}(x)$ covers $K$, and $K$ is compact, we
may find a subcover, which we denote by $\bigcup_{j=1}^{N} B_{2\delta_j}(x_j)$. If we let
$\delta = \min(\delta_1, \ldots, \delta_N)$, then we must have $d(K, F) \ge \delta > 0$. Indeed, if $x \in K$
and $y \in F$, then for some $j$ we have $|x_j - x| \le 2\delta_j$, and by construction
$|y - x_j| \ge 3\delta_j$. Therefore

$$|y - x| \ge |y - x_j| - |x_j - x| \ge 3\delta_j - 2\delta_j \ge \delta,$$

and the lemma is proved.

Property 5 The complement of a measurable set is measurable.

If $E$ is measurable, then for every positive integer $n$ we may choose an
open set $\mathcal{O}_n$ with $E \subset \mathcal{O}_n$ and $m_*(\mathcal{O}_n - E) \le 1/n$. The complement $\mathcal{O}_n^c$
is closed, hence measurable, which implies that the union $S = \bigcup_{n=1}^{\infty} \mathcal{O}_n^c$
is also measurable by Property 3. Now we simply note that $S \subset E^c$, and

$$(E^c - S) \subset (\mathcal{O}_n - E),$$

such that $m_*(E^c - S) \le 1/n$ for all $n$. Therefore, $m_*(E^c - S) = 0$, and
$E^c - S$ is measurable by Property 2. Therefore $E^c$ is measurable since
it is the union of two measurable sets, namely $S$ and $(E^c - S)$.

Property 6 A countable intersection of measurable sets is measurable.

This follows from Properties 3 and 5, since

$$\bigcap_{j=1}^{\infty} E_j = \left( \bigcup_{j=1}^{\infty} E_j^c \right)^c.$$

In conclusion, we find that the family of measurable sets is closed under
the familiar operations of set theory. We point out that we have shown 
more than simply closure with respect to finite unions and intersections: 
we have proved that the collection of measurable sets is closed under 
countable unions and intersections. This passage from finite operations 
to infinite ones is crucial in the context of analysis. We emphasize, how- 
ever, that the operations of uncountable unions or intersections are not 
permissible when dealing with measurable sets! 

Theorem 3.2 If $E_1, E_2, \ldots$, are disjoint measurable sets, and
$E = \bigcup_{j=1}^{\infty} E_j$, then

$$m(E) = \sum_{j=1}^{\infty} m(E_j).$$

Proof. First, we assume further that each $E_j$ is bounded. Then, for
each $j$, by applying the definition of measurability to $E_j^c$, we can choose
a closed subset $F_j$ of $E_j$ with $m_*(E_j - F_j) \le \epsilon/2^j$. For each fixed $N$,
the sets $F_1, \ldots, F_N$ are compact and disjoint, so that
$m\left( \bigcup_{j=1}^{N} F_j \right) = \sum_{j=1}^{N} m(F_j)$. Since $\bigcup_{j=1}^{N} F_j \subset E$, we must have

$$m(E) \ge \sum_{j=1}^{N} m(F_j) \ge \sum_{j=1}^{N} m(E_j) - \epsilon.$$

Letting $N$ tend to infinity, since $\epsilon$ was arbitrary we find that

$$m(E) \ge \sum_{j=1}^{\infty} m(E_j).$$

Since the reverse inequality always holds (by sub-additivity in
Observation 2), this concludes the proof when each $E_j$ is bounded.

In the general case, we select any sequence of cubes $\{Q_k\}_{k=1}^{\infty}$ that
increases to $\mathbb{R}^d$, in the sense that $Q_k \subset Q_{k+1}$ for all $k \ge 1$ and $\bigcup_{k=1}^{\infty} Q_k = \mathbb{R}^d$.
We then let $S_1 = Q_1$ and $S_k = Q_k - Q_{k-1}$ for $k \ge 2$. If we define
measurable sets by $E_{j,k} = E_j \cap S_k$, then

$$E = \bigcup_{j,k} E_{j,k}.$$

The union above is disjoint and every $E_{j,k}$ is bounded. Moreover
$E_j = \bigcup_{k=1}^{\infty} E_{j,k}$, and this union is also disjoint. Putting these facts together,
and using what has already been proved, we obtain

$$m(E) = \sum_{j,k} m(E_{j,k}) = \sum_{j} \sum_{k} m(E_{j,k}) = \sum_{j} m(E_j),$$

as claimed.

With this, the countable additivity of the Lebesgue measure on mea- 
surable sets has been established. This result provides the necessary 
connection between the following: 

• our primitive notion of volume given by the exterior measure, 

• the more refined idea of measurable sets, and 

• the countably infinite operations allowed on these sets. 

We make two definitions to state succinctly some further consequences.
If $E_1, E_2, \ldots$ is a countable collection of subsets of $\mathbb{R}^d$ that increases
to $E$ in the sense that $E_k \subset E_{k+1}$ for all $k$, and $E = \bigcup_{k=1}^{\infty} E_k$, then we
write $E_k \nearrow E$.

Similarly, if $E_1, E_2, \ldots$ decreases to $E$ in the sense that $E_k \supset E_{k+1}$ for
all $k$, and $E = \bigcap_{k=1}^{\infty} E_k$, we write $E_k \searrow E$.

Corollary 3.3 Suppose $E_1, E_2, \ldots$ are measurable subsets of $\mathbb{R}^d$.

(i) If $E_k \nearrow E$, then $m(E) = \lim_{N \to \infty} m(E_N)$.

(ii) If $E_k \searrow E$ and $m(E_k) < \infty$ for some $k$, then

$$m(E) = \lim_{N \to \infty} m(E_N).$$

Proof. For the first part, let $G_1 = E_1$, $G_2 = E_2 - E_1$, and in
general $G_k = E_k - E_{k-1}$ for $k \ge 2$. By their construction, the sets $G_k$ are
measurable, disjoint, and $E = \bigcup_{k=1}^{\infty} G_k$. Hence

$$m(E) = \sum_{k=1}^{\infty} m(G_k) = \lim_{N \to \infty} \sum_{k=1}^{N} m(G_k)
= \lim_{N \to \infty} m\left( \bigcup_{k=1}^{N} G_k \right),$$

and since $\bigcup_{k=1}^{N} G_k = E_N$ we get the desired limit.

For the second part, we may clearly assume that $m(E_1) < \infty$. Let
$G_k = E_k - E_{k+1}$ for each $k$, so that

$$E_1 = E \cup \bigcup_{k=1}^{\infty} G_k$$

is a disjoint union of measurable sets. As a result, we find that

$$m(E_1) = m(E) + \lim_{N \to \infty} \sum_{k=1}^{N-1} \left( m(E_k) - m(E_{k+1}) \right)
= m(E) + m(E_1) - \lim_{N \to \infty} m(E_N).$$

Hence, since $m(E_1) < \infty$, we see that $m(E) = \lim_{N \to \infty} m(E_N)$, and the
proof is complete.

The reader should note that the second conclusion may fail without
the assumption that $m(E_k) < \infty$ for some $k$. This is shown by the simple
example when $E_n = (n, \infty) \subset \mathbb{R}$, for all $n$.
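
To spell the example out, it may help to set it beside a decreasing sequence that does satisfy the finiteness assumption. The second sequence below, $E_n = (0, 1/n)$, is our own choice for comparison and does not appear in the text.

```latex
\[
E_n = (n,\infty): \qquad m(E_n) = \infty \ \text{for all } n, \qquad
\bigcap_{n=1}^{\infty} E_n = \emptyset, \qquad
m\Big(\bigcap_{n=1}^{\infty} E_n\Big) = 0 \neq \infty = \lim_{n\to\infty} m(E_n).
\]
\[
E_n = (0, 1/n): \qquad m(E_1) = 1 < \infty, \qquad
\bigcap_{n=1}^{\infty} E_n = \emptyset, \qquad
\lim_{n\to\infty} m(E_n) = \lim_{n\to\infty} \frac{1}{n} = 0 = m(\emptyset).
\]
```

In both cases the sets decrease to the empty set; only in the second does the limit of the measures agree with the measure of the limit.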

What follows provides an important geometric and analytic insight 
into the nature of measurable sets, in terms of their relation to open and 
closed sets. Its thrust is that, in effect, an arbitrary measurable set can 
be well approximated by the open sets that contain it, and alternatively, 
by the closed sets it contains. 

Theorem 3.4 Suppose $E$ is a measurable subset of $\mathbb{R}^d$. Then, for every
$\epsilon > 0$:

(i) There exists an open set $\mathcal{O}$ with $E \subset \mathcal{O}$ and $m(\mathcal{O} - E) \le \epsilon$.

(ii) There exists a closed set $F$ with $F \subset E$ and $m(E - F) \le \epsilon$.

(iii) If $m(E)$ is finite, there exists a compact set $K$ with $K \subset E$ and
$m(E - K) \le \epsilon$.

(iv) If $m(E)$ is finite, there exists a finite union $F = \bigcup_{j=1}^{N} Q_j$ of closed
cubes such that

$$m(E \triangle F) \le \epsilon.$$

The notation $E \triangle F$ stands for the symmetric difference between the
sets $E$ and $F$, defined by $E \triangle F = (E - F) \cup (F - E)$, which consists of
those points that belong to only one of the two sets $E$ or $F$.

Proof. Part (i) is just the definition of measurability. For the second
part, we know that $E^c$ is measurable, so there exists an open set $\mathcal{O}$ with
$E^c \subset \mathcal{O}$ and $m(\mathcal{O} - E^c) \le \epsilon$. If we let $F = \mathcal{O}^c$, then $F$ is closed, $F \subset E$,
and $E - F = \mathcal{O} - E^c$. Hence $m(E - F) \le \epsilon$ as desired.

For (iii), we first pick a closed set $F$ so that $F \subset E$ and
$m(E - F) \le \epsilon/2$. For each $n$, we let $B_n$ denote the ball centered at the origin of radius
$n$, and define compact sets $K_n = F \cap B_n$. Then $E - K_n$ is a sequence
of measurable sets that decreases to $E - F$, and since $m(E) < \infty$, we
conclude that for all large $n$ one has $m(E - K_n) < \epsilon$.

For the last part, choose a family of closed cubes $\{Q_j\}_{j=1}^{\infty}$ so that

$$E \subset \bigcup_{j=1}^{\infty} Q_j \quad \text{and} \quad \sum_{j=1}^{\infty} |Q_j| \le m(E) + \epsilon/2.$$

Since $m(E) < \infty$, the series converges and there exists $N > 0$ such that
$\sum_{j=N+1}^{\infty} |Q_j| < \epsilon/2$. If $F = \bigcup_{j=1}^{N} Q_j$, then

$$m(E \triangle F) = m(E - F) + m(F - E)
\le m\left( \bigcup_{j=N+1}^{\infty} Q_j \right) + m\left( \bigcup_{j=1}^{\infty} Q_j - E \right)
\le \sum_{j=N+1}^{\infty} |Q_j| + \sum_{j=1}^{\infty} |Q_j| - m(E)
\le \epsilon.$$
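
Part (iv) can be illustrated on a concrete set. The sketch below rests on assumptions of its own: it takes $d = 1$ and the hypothetical set $E = \bigcup_{j \ge 1} [j, j + 2^{-j}]$, so that $m(E) = 1$, together with $F_N$, the union of the first $N$ of these intervals. Then $E \triangle F_N$ is the union of the remaining intervals and has measure $2^{-N}$. The function names are hypothetical.

```python
from fractions import Fraction

def sym_diff_measure(N):
    """m(E △ F_N) for E = union of [j, j + 2**-j], j >= 1, and F_N its first N pieces."""
    # The tail sum over j > N of 2**-j is a geometric series equal to 2**-N.
    return Fraction(1, 2 ** N)

def cubes_needed(eps):
    """Smallest N for which m(E △ F_N) <= eps."""
    N = 0
    while sym_diff_measure(N) > eps:
        N += 1
    return N

for eps in (Fraction(1, 8), Fraction(1, 100), Fraction(1, 10 ** 6)):
    # Number of intervals needed to approximate E within eps.
    print(eps, cubes_needed(eps))
```

The number of cubes required grows only logarithmically in $1/\epsilon$ here, because the tail of the series of volumes decays geometrically; for a general set the rate depends on the chosen covering.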


Invariance properties of Lebesgue measure 

A crucial property of Lebesgue measure in $\mathbb{R}^d$ is its translation-invariance,
which can be stated as follows: if $E$ is a measurable set and $h \in \mathbb{R}^d$, then
the set $E_h = E + h = \{x + h : x \in E\}$ is also measurable, and
$m(E + h) = m(E)$. With the observation that this holds for the special case
when $E$ is a cube, one passes to the exterior measure of arbitrary sets
$E$, and sees from the definition of $m_*$ given in Section 2 that
$m_*(E_h) = m_*(E)$. To prove the measurability of $E_h$ under the assumption that $E$
is measurable, we note that if $\mathcal{O}$ is open, $\mathcal{O} \supset E$, and $m_*(\mathcal{O} - E) < \epsilon$,
then $\mathcal{O}_h$ is open, $\mathcal{O}_h \supset E_h$, and $m_*(\mathcal{O}_h - E_h) < \epsilon$.

In the same way one can prove the relative dilation-invariance of
Lebesgue measure. Suppose $\delta > 0$, and denote by $\delta E$ the set
$\{\delta x : x \in E\}$. We can then assert that $\delta E$ is measurable whenever $E$ is,
and $m(\delta E) = \delta^d m(E)$. One can also easily see that Lebesgue
measure is reflection-invariant. That is, whenever $E$ is measurable, so is
$-E = \{-x : x \in E\}$ and $m(-E) = m(E)$.
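
These invariance properties are easy to test on the simplest measurable sets. The sketch below is an illustration only: it works in dimension $d = 1$ with finite unions of pairwise disjoint intervals, for which the measure is the sum of the lengths, and the helper names are hypothetical. In this setting the dilation factor $\delta^d$ is simply $\delta$.

```python
from fractions import Fraction

def length(intervals):
    """Total length of a finite union of pairwise disjoint intervals (a, b)."""
    return sum(b - a for a, b in intervals)

def translate(intervals, h):
    """The set E + h."""
    return [(a + h, b + h) for a, b in intervals]

def dilate(intervals, delta):
    """The set delta * E, for delta > 0 (the endpoints keep their order)."""
    return [(delta * a, delta * b) for a, b in intervals]

def reflect(intervals):
    """The set -E (the map x -> -x swaps the endpoints)."""
    return [(-b, -a) for a, b in intervals]

E = [(Fraction(0), Fraction(1, 2)), (Fraction(2), Fraction(11, 4))]
print(length(E), length(translate(E, Fraction(7, 3))),
      length(dilate(E, Fraction(3))), length(reflect(E)))
```

Translation and reflection leave the total length unchanged, while dilation by $\delta = 3$ multiplies it by 3.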

Other invariance properties of Lebesgue measure are in Exercises 7
and 8, and Problem 4 of Chapter 2.


σ-algebras and Borel sets

A σ-algebra of sets is a collection of subsets of $\mathbb{R}^d$ that is closed under
countable unions, countable intersections, and complements.

The collection of all subsets of $\mathbb{R}^d$ is of course a σ-algebra. A more
interesting and relevant example consists of all measurable sets in $\mathbb{R}^d$,
which we have just shown also forms a σ-algebra.

Another σ-algebra, which plays a vital role in analysis, is the Borel
σ-algebra in $\mathbb{R}^d$, denoted by $\mathcal{B}_{\mathbb{R}^d}$, which by definition is the smallest
σ-algebra that contains all open sets. Elements of this σ-algebra are called
Borel sets.

The definition of the Borel σ-algebra will be meaningful once we have
defined the term “smallest,” and shown that such a σ-algebra exists and
is unique. The term “smallest” means that if $\mathcal{S}$ is any σ-algebra that
contains all open sets in $\mathbb{R}^d$, then necessarily $\mathcal{B}_{\mathbb{R}^d} \subset \mathcal{S}$. Since we observe
that any intersection (not necessarily countable) of σ-algebras is again a
σ-algebra, we may define $\mathcal{B}_{\mathbb{R}^d}$ as the intersection of all σ-algebras that
contain the open sets. This shows the existence and uniqueness of the
Borel σ-algebra.

Since open sets are measurable, we conclude that the Borel σ-algebra
is contained in the σ-algebra of measurable sets. Naturally, we may ask
if this inclusion is strict: do there exist Lebesgue measurable sets which
are not Borel sets? The answer is “yes.” (See Exercise 35.)

From the point of view of the Borel sets, the Lebesgue sets arise as
the completion of the σ-algebra of Borel sets, that is, by adjoining all
subsets of Borel sets of measure zero. This is an immediate consequence
of Corollary 3.5 below.

Starting with the open and closed sets, which are the simplest Borel
sets, one could try to list the Borel sets in order of their complexity. Next
in order would come countable intersections of open sets; such sets are
called $G_\delta$ sets. Alternatively, one could consider their complements, the
countable union of closed sets, called the $F_\sigma$ sets.^

Corollary 3.5 A subset $E$ of $\mathbb{R}^d$ is measurable

(i) if and only if $E$ differs from a $G_\delta$ by a set of measure zero,

(ii) if and only if $E$ differs from an $F_\sigma$ by a set of measure zero.

Proof. Clearly $E$ is measurable whenever it satisfies either (i) or (ii),
since the $F_\sigma$, $G_\delta$, and sets of measure zero are measurable.

Conversely, if $E$ is measurable, then for each integer $n \ge 1$ we may
select an open set $\mathcal{O}_n$ that contains $E$, and such that $m(\mathcal{O}_n - E) \le 1/n$.
Then $S = \bigcap_{n=1}^{\infty} \mathcal{O}_n$ is a $G_\delta$ that contains $E$, and $(S - E) \subset (\mathcal{O}_n - E)$
for all $n$. Therefore $m(S - E) \le 1/n$ for all $n$; hence $S - E$ has exterior
measure zero, and is therefore measurable.

For the second implication, we simply apply part (ii) of Theorem 3.4
with $\epsilon = 1/n$, and take the union of the resulting closed sets.


^The terminology $G_\delta$ comes from German “Gebiete” and “Durchschnitt”; $F_\sigma$ comes from
French “fermé” and “somme.”


Construction of a non-measurable set 

Are all subsets of $\mathbb{R}^d$ measurable? In this section, we answer this question
when $d = 1$ by constructing a subset of $\mathbb{R}$ which is not measurable.^
This justifies the conclusion that a satisfactory theory of measure cannot
encompass all subsets of $\mathbb{R}$.

The construction of a non-measurable set $\mathcal{N}$ uses the axiom of choice,
and rests on a simple equivalence relation among real numbers in $[0, 1]$.

We write $x \sim y$ whenever $x - y$ is rational, and note that this is an
equivalence relation since the following properties hold:

• $x \sim x$ for every $x \in [0, 1]$

• if $x \sim y$, then $y \sim x$

• if $x \sim y$ and $y \sim z$, then $x \sim z$.

Two equivalence classes either are disjoint or coincide, and $[0, 1]$ is the
disjoint union of all equivalence classes, which we write as

$$[0, 1] = \bigcup_{\alpha} \mathcal{E}_\alpha.$$

Now we construct the set $\mathcal{N}$ by choosing exactly one element $x_\alpha$ from
each $\mathcal{E}_\alpha$, and setting $\mathcal{N} = \{x_\alpha\}$. This (seemingly obvious) step requires
further comment, which we postpone until after the proof of the following
theorem.

Theorem 3.6 The set $\mathcal{N}$ is not measurable.

The proof is by contradiction, so we assume that $\mathcal{N}$ is measurable. Let
$\{r_k\}_{k=1}^{\infty}$ be an enumeration of all the rationals in $[-1, 1]$, and consider
the translates

$$\mathcal{N}_k = \mathcal{N} + r_k.$$

We claim that the sets $\mathcal{N}_k$ are disjoint, and

(4)  $$[0, 1] \subset \bigcup_{k=1}^{\infty} \mathcal{N}_k \subset [-1, 2].$$

To see why these sets are disjoint, suppose that the intersection
$\mathcal{N}_k \cap \mathcal{N}_{k'}$ is non-empty. Then there exist rationals $r_k \ne r_{k'}$ and $\alpha$ and
$\beta$ with $x_\alpha + r_k = x_\beta + r_{k'}$; hence

$$x_\alpha - x_\beta = r_{k'} - r_k.$$

Consequently $\alpha \ne \beta$ and $x_\alpha - x_\beta$ is rational; hence $x_\alpha \sim x_\beta$, which
contradicts the fact that $\mathcal{N}$ contains only one representative of each
equivalence class.

The second inclusion is straightforward since each $\mathcal{N}_k$ is contained in
$[-1, 2]$ by construction. Finally, if $x \in [0, 1]$, then $x \sim x_\alpha$ for some $\alpha$, and
therefore $x - x_\alpha = r_k$ for some $k$. Hence $x \in \mathcal{N}_k$, and the first inclusion
holds.

Now we may conclude the proof of the theorem. If $\mathcal{N}$ were measurable,
then so would be $\mathcal{N}_k$ for all $k$, and since the union $\bigcup_{k=1}^{\infty} \mathcal{N}_k$ is disjoint,
the inclusions in (4) yield

$$1 \le \sum_{k=1}^{\infty} m(\mathcal{N}_k) \le 3.$$

Since $\mathcal{N}_k$ is a translate of $\mathcal{N}$, we must have $m(\mathcal{N}_k) = m(\mathcal{N})$ for all $k$.
Consequently,

$$1 \le \sum_{k=1}^{\infty} m(\mathcal{N}) \le 3.$$

This is the desired contradiction, since neither $m(\mathcal{N}) = 0$ nor $m(\mathcal{N}) > 0$
is possible: in the first case the sum equals 0, which is less than 1, and in
the second case the sum is infinite, which exceeds 3.


^The existence of such a set in $\mathbb{R}$ implies the existence of corresponding non-measurable
subsets of $\mathbb{R}^d$ for each $d$, as a consequence of Proposition 3.4 in the next chapter.

Axiom of choice 

That the construction of the set $\mathcal{N}$ is possible is based on the following
general proposition.

• Suppose $E$ is a set and $\{E_\alpha\}$ is a collection of non-empty subsets
of $E$. (The indexing set of $\alpha$'s is not assumed to be countable.)
Then there is a function $\alpha \mapsto x_\alpha$ (a “choice function”) such that
$x_\alpha \in E_\alpha$, for all $\alpha$.

In this general form this assertion is known as the axiom of choice.
This axiom occurs (at least implicitly) in many proofs in mathematics, 
but because of its seeming intuitive self-evidence, its significance was 
not at first understood. The initial realization of the importance of 
this axiom was in its use to prove a famous assertion of Cantor, the 
well-ordering principle. This proposition (sometimes referred to as 
“transfinite induction”) can be formulated as follows. 

A set $E$ is linearly ordered if there is a binary relation $\le$ such that:

(a) $x \le x$ for all $x \in E$.

(b) If $x, y \in E$ are distinct, then either $x \le y$ or $y \le x$ (but not both).

(c) If $x \le y$ and $y \le z$, then $x \le z$.

We say that a set $E$ can be well-ordered if it can be linearly ordered in
such a way that every non-empty subset $A \subset E$ has a smallest element
in that ordering (that is, an element $x_0 \in A$ such that $x_0 \le x$ for any
other $x \in A$).

A simple example of a well-ordered set is $\mathbb{Z}^+$, the positive integers with
their usual ordering. The fact that $\mathbb{Z}^+$ is well-ordered is an essential part
of the usual (finite) induction principle. More generally, the well-ordering 
principle states: 

• Any set E can be well-ordered. 

It is in fact nearly obvious that the well-ordering principle implies the
axiom of choice: if we well-order $E$, we can choose $x_\alpha$ to be the smallest
element in $E_\alpha$, and in this way we have constructed the required choice
function. It is also true, but not as easy to show, that the converse impli- 
cation holds, namely that the axiom of choice implies the well-ordering 
principle. (See Problem 6 for another equivalent formulation of the Ax- 
iom of Choice.) 

We shall follow the common practice of assuming the axiom of choice
(and hence the validity of the well-ordering principle).® However, we
should point out that while the axiom of choice seems self-evident, the
well-ordering principle leads quickly to some baffling conclusions: one
only needs to spend a little time trying to imagine what a well-ordering
of the reals might look like!


®It can be proved that in an appropriate formulation of the axioms of set theory, the
axiom of choice is independent of the other axioms; thus we are free to accept its validity.


4 Measurable functions

With the notion of measurable sets in hand, we now turn our attention 
to the objects that lie at the heart of integration theory: measurable 
functions. 

The starting point is the notion of a characteristic function of a set
$E$, which is defined by

$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \notin E. \end{cases}$$

The next step is to pass to the functions that are the building blocks of
integration theory. For the Riemann integral it is in effect the class of
step functions, with each given as a finite sum

(5)  $$f = \sum_{k=1}^{N} a_k \chi_{R_k},$$

where each $R_k$ is a rectangle, and the $a_k$ are constants.

However, for the Lebesgue integral we need a more general notion, as
we shall see in the next chapter. A simple function is a finite sum

(6)  $$f = \sum_{k=1}^{N} a_k \chi_{E_k},$$

where each $E_k$ is a measurable set of finite measure, and the $a_k$ are
constants.
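
A simple function is easy to model in code once sets are represented by membership tests. The sketch below is an illustration under that assumption: each set $E_k$ is given by a predicate (a choice of ours, since measurability is not something a program can check), and the names `indicator` and `simple_function` are hypothetical. Note that the sets $E_k$ in the sum are allowed to overlap.

```python
def indicator(E):
    """Characteristic function of a set E given by a membership predicate."""
    return lambda x: 1 if E(x) else 0

def simple_function(terms):
    """Build f = sum_k a_k * chi_{E_k} from a list of (a_k, E_k) pairs."""
    return lambda x: sum(a * indicator(E)(x) for a, E in terms)

f = simple_function([
    (2, lambda x: 0 <= x < 1),       # 2 * chi_[0,1)
    (-1, lambda x: 0.5 <= x <= 3),   # -1 * chi_[1/2,3]
])

for x in (0.25, 0.75, 2, 5):
    # On the overlap [1/2, 1) both terms contribute, giving 2 - 1 = 1.
    print(x, f(x))
```

Replacing the predicates by tests for membership in rectangles gives a step function in the sense of (5).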


4.1 Definition and basic properties 

We begin by considering only real-valued functions $f$ on $\mathbb{R}^d$, which we
allow to take on the infinite values $+\infty$ and $-\infty$, so that $f(x)$ belongs
to the extended real numbers

$$-\infty \le f(x) \le \infty.$$

We shall say that $f$ is finite-valued if $-\infty < f(x) < \infty$ for all $x$. In
the theory that follows, and the many applications of it, we shall almost
always find ourselves in situations where a function takes on infinite
values on at most a set of measure zero.

A function $f$ defined on a measurable subset $E$ of $\mathbb{R}^d$ is measurable,
if for all $a \in \mathbb{R}$, the set

$$f^{-1}([-\infty, a)) = \{x \in E : f(x) < a\}$$

is measurable. To simplify our notation, we shall often denote the set
$\{x \in E : f(x) < a\}$ simply by $\{f < a\}$ whenever no confusion is possible.


First, we note that there are many equivalent definitions of measurable
functions. For example, we may require instead that the inverse image of
closed intervals be measurable. Indeed, to prove that $f$ is measurable if
and only if $\{x : f(x) \le a\} = \{f \le a\}$ is measurable for every $a$, we note
that in one direction, one has

$$\{f \le a\} = \bigcap_{k=1}^{\infty} \{f < a + 1/k\},$$

and recall that the countable intersection of measurable sets is
measurable. For the other direction, we observe that

$$\{f < a\} = \bigcup_{k=1}^{\infty} \{f \le a - 1/k\}.$$
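
The identity $\{f < a\} = \bigcup_{k \ge 1} \{f \le a - 1/k\}$ says that a strict inequality $f(x) < a$ is always witnessed by some index $k$. The sketch below is an illustration with a hypothetical function name; it computes such a $k$ from the value $f(x)$, using exact rationals to avoid rounding.

```python
from fractions import Fraction
from math import floor

def witness(value, a):
    """Given f(x) = value < a, return an integer k >= 1 with value <= a - 1/k."""
    if not value < a:
        raise ValueError("x is not in {f < a}")
    # Any integer k >= 1/(a - value) works; this choice is strictly larger.
    return floor(1 / (a - value)) + 1

value, a = Fraction(99, 100), Fraction(1)
k = witness(value, a)
# k is chosen so that value <= a - 1/k holds.
print(k, value <= a - Fraction(1, k))
```

Conversely, if $f(x) \le a - 1/k$ for some $k$ then certainly $f(x) < a$, which gives the other inclusion.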

Similarly, $f$ is measurable if and only if $\{f \ge a\}$ (or $\{f > a\}$) is
measurable for every $a$. In the first case this is immediate from our definition
and the fact that $\{f \ge a\}$ is the complement of $\{f < a\}$, and in the
second case this follows from what we have just proved and the fact that
$\{f \le a\} = \{f > a\}^c$. A simple consequence is that $-f$ is measurable
whenever $f$ is measurable.

In the same way, one can show that if $f$ is finite-valued, then it is
measurable if and only if the sets $\{a < f < b\}$ are measurable for every
$a, b \in \mathbb{R}$. Similar conclusions hold for whichever combination of strict or
weak inequalities one chooses. For example, if $f$ is finite-valued, then it
is measurable if and only if $\{a \le f \le b\}$ is measurable for all $a, b \in \mathbb{R}$. By the same
arguments one sees the following:

Property 1 The finite-valued function $f$ is measurable if and only if
$f^{-1}(\mathcal{O})$ is measurable for every open set $\mathcal{O}$, and if and only if $f^{-1}(F)$ is
measurable for every closed set $F$.

Note that this property also applies to extended-valued functions, if we
make the additional hypothesis that both $f^{-1}(\infty)$ and $f^{-1}(-\infty)$ are
measurable sets.

Property 2 If $f$ is continuous on $\mathbb{R}^d$, then $f$ is measurable. If $f$ is
measurable and finite-valued, and $\Phi$ is continuous, then $\Phi \circ f$ is measurable.

In fact, $\Phi$ is continuous, so $\Phi^{-1}((-\infty, a))$ is an open set $\mathcal{O}$, and hence
$(\Phi \circ f)^{-1}((-\infty, a)) = f^{-1}(\mathcal{O})$ is measurable.

It should be noted, however, that in general it is not true that
$f \circ \Phi$ is measurable whenever $f$ is measurable and $\Phi$ is continuous. See
Exercise 35.

Property 3 Suppose $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions.
Then

$$\sup_n f_n(x), \quad \inf_n f_n(x), \quad \limsup_{n \to \infty} f_n(x) \quad \text{and} \quad \liminf_{n \to \infty} f_n(x)$$

are measurable.

Proving that $\sup_n f_n$ is measurable requires noting that
$\{\sup_n f_n > a\} = \bigcup_n \{f_n > a\}$. This also yields the result for
$\inf_n f_n(x)$, since this quantity equals $-\sup_n(-f_n(x))$.

The result for the limsup and liminf also follows from the two observations

$$\limsup_{n \to \infty} f_n(x) = \inf_k \{\sup_{n \ge k} f_n\} \quad \text{and} \quad \liminf_{n \to \infty} f_n(x) = \sup_k \{\inf_{n \ge k} f_n\}.$$

Property 4 If $\{f_n\}_{n=1}^{\infty}$ is a collection of measurable functions, and

$$\lim_{n \to \infty} f_n(x) = f(x),$$

then $f$ is measurable.

Since $f(x) = \limsup_{n \to \infty} f_n(x) = \liminf_{n \to \infty} f_n(x)$, this property is a
consequence of property 3.

Property 5 If $f$ and $g$ are measurable, then

(i) The integer powers $f^k$, $k \ge 1$ are measurable.

(ii) $f + g$ and $fg$ are measurable if both $f$ and $g$ are finite-valued.

For (i) we simply note that if $k$ is odd, then $\{f^k > a\} = \{f > a^{1/k}\}$, and
if $k$ is even and $a \ge 0$, then $\{f^k > a\} = \{f > a^{1/k}\} \cup \{f < -a^{1/k}\}$.

For (ii), we first see that $f + g$ is measurable because

$$\{f + g > a\} = \bigcup_{r \in \mathbb{Q}} \{f > a - r\} \cap \{g > r\},$$

with $\mathbb{Q}$ denoting the rationals.

Finally, $fg$ is measurable because of the previous results and the fact
that

$$fg = \frac{1}{4}\left[(f + g)^2 - (f - g)^2\right].$$

We shall say that two functions $f$ and $g$ defined on a set $E$ are equal
almost everywhere, and write

$$f(x) = g(x) \quad \text{a.e. } x \in E,$$

if the set $\{x \in E : f(x) \ne g(x)\}$ has measure zero. We sometimes
abbreviate this by saying that $f = g$ a.e. More generally, a property or
statement is said to hold almost everywhere (a.e.) if it is true except on
a set of measure zero.

One sees easily that if $f$ is measurable and $f = g$ a.e., then $g$ is
measurable. This follows at once from the fact that $\{f < a\}$ and $\{g < a\}$ differ
by a set of measure zero. Moreover, all the properties above can be
relaxed to conditions holding almost everywhere. For instance, if $\{f_n\}_{n=1}^{\infty}$
is a collection of measurable functions, and

$$\lim_{n \to \infty} f_n(x) = f(x) \quad \text{a.e.},$$

then $f$ is measurable.

Note that if $f$ and $g$ are defined almost everywhere on a measurable
subset $E \subset \mathbb{R}^d$, then the functions $f + g$ and $fg$ can only be defined on
the intersection of the domains of $f$ and $g$. Since the union of two sets of
measure zero has again measure zero, $f + g$ is defined almost everywhere
on $E$. We summarize this discussion as follows.

Property 6 Suppose $f$ is measurable, and $f(x) = g(x)$ for a.e. $x$. Then
$g$ is measurable.

In this light, Property 5 (ii) also holds when $f$ and $g$ are finite-valued
almost everywhere.

4.2 Approximation by simple functions or step functions 

The theorems in this section are all of the same nature and provide
further insight into the structure of measurable functions. We begin by
approximating pointwise, non-negative measurable functions by simple
functions.

Theorem 4.1 Suppose $f$ is a non-negative measurable function on $\mathbb{R}^d$.
Then there exists an increasing sequence of non-negative simple functions
$\{\varphi_k\}_{k=1}^{\infty}$ that converges pointwise to $f$, namely,

$$\varphi_k(x) \le \varphi_{k+1}(x) \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x), \quad \text{for all } x.$$

Proof. We begin first with a truncation. For $N \ge 1$, let $Q_N$ denote
the cube centered at the origin and of side length $N$. Then we define

$$F_N(x) = \begin{cases} f(x) & \text{if } x \in Q_N \text{ and } f(x) \le N, \\ N & \text{if } x \in Q_N \text{ and } f(x) > N, \\ 0 & \text{otherwise.} \end{cases}$$

Then, $F_N(x) \to f(x)$ as $N$ tends to infinity for all $x$. Now, we partition
the range of $F_N$, namely $[0, N]$, as follows. For fixed $N, M \ge 1$, we define

$$E_{\ell,M} = \left\{ x \in Q_N : \frac{\ell}{M} < F_N(x) \le \frac{\ell + 1}{M} \right\}, \quad \text{for } 0 \le \ell < NM.$$

Then we may form

$$F_{N,M}(x) = \sum_{\ell} \frac{\ell}{M} \chi_{E_{\ell,M}}(x).$$

Each $F_{N,M}$ is a simple function that satisfies $0 \le F_N(x) - F_{N,M}(x) \le 1/M$
for all $x$. If we now choose $N = M = 2^k$ with $k \ge 1$ integral, and
let $\varphi_k = F_{2^k, 2^k}$, then we see that $0 \le F_M(x) - \varphi_k(x) \le 1/2^k$ for all $x$,
$\{\varphi_k\}$ is increasing, and this sequence satisfies all the desired properties.
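
The construction in the proof can be tried out numerically. The following is a minimal sketch in dimension one, assuming $f$ is given as an ordinary Python function with non-negative values; the name `simple_approx` is hypothetical and chosen only for this illustration. It evaluates $\varphi_k = F_{2^k, 2^k}$ at a single point by truncating and then rounding down to the grid of step $1/M$.

```python
import math

def simple_approx(f, k):
    """Return phi_k = F_{N,M} with N = M = 2**k, for a non-negative f on the real line."""
    n = m = 2 ** k

    def phi(x):
        # Truncation F_N: zero outside the cube Q_N = [-N/2, N/2], capped at height N.
        if abs(x) > n / 2:
            return 0.0
        fn = min(f(x), n)
        if fn == 0:
            return 0.0
        # Level index l with l/M < F_N(x) <= (l+1)/M; phi takes the value l/M there.
        l = math.ceil(fn * m) - 1
        return l / m

    return phi

# Example: approximate f(x) = x^2 at a few points for increasing k.
f = lambda x: x * x
for k in (1, 2, 4, 8):
    print(k, [simple_approx(f, k)(x) for x in (0.5, 1.3, 3.0)])
```

Each printed row should be pointwise no smaller than the previous one and never exceed $f$, which is the monotonicity asserted in the theorem.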

Note that the result holds for non-negative functions that are
extended-valued, if the limit $+\infty$ is allowed. We now drop the assumption that $f$
is non-negative, and also allow the extended limit $-\infty$.

Theorem 4.2 Suppose $f$ is measurable on $\mathbb{R}^d$. Then there exists a
sequence of simple functions $\{\varphi_k\}_{k=1}^{\infty}$ that satisfies

$$|\varphi_k(x)| \le |\varphi_{k+1}(x)| \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x), \quad \text{for all } x.$$

In particular, we have $|\varphi_k(x)| \le |f(x)|$ for all $x$ and $k$.

Proof. We use the following decomposition of the function $f$:
$f(x) = f^+(x) - f^-(x)$, where

$$f^+(x) = \max(f(x), 0) \quad \text{and} \quad f^-(x) = \max(-f(x), 0).$$

Since both $f^+$ and $f^-$ are non-negative, the previous theorem yields
two increasing sequences of non-negative simple functions $\{\varphi_k^{(1)}(x)\}_{k=1}^{\infty}$
and $\{\varphi_k^{(2)}(x)\}_{k=1}^{\infty}$ which converge pointwise to $f^+$ and $f^-$, respectively.
Then, if we let

$$\varphi_k(x) = \varphi_k^{(1)}(x) - \varphi_k^{(2)}(x),$$

we see that $\varphi_k(x)$ converges to $f(x)$ for all $x$. Finally, the sequence $\{|\varphi_k|\}$
is increasing because the definition of $f^+$, $f^-$ and the properties of
$\varphi_k^{(1)}$ and $\varphi_k^{(2)}$ imply that

$$|\varphi_k(x)| = \varphi_k^{(1)}(x) + \varphi_k^{(2)}(x).$$


We may now go one step further, and approximate by step functions.
Here, in general, the convergence may hold only almost everywhere.

Theorem 4.3 Suppose $f$ is measurable on $\mathbb{R}^d$. Then there exists a
sequence of step functions $\{\psi_k\}_{k=1}^{\infty}$ that converges pointwise to $f(x)$ for
almost every $x$.

Proof. By the previous result, it suffices to show that if $E$ is a
measurable set with finite measure, then $f = \chi_E$ can be approximated
by step functions. To this end, we recall part (iv) of Theorem 3.4,
which states that for every $\epsilon$ there exist cubes $Q_1, \ldots, Q_N$ such that
$m(E \triangle \bigcup_{j=1}^{N} Q_j) \le \epsilon$. By considering the grid formed by extending the
sides of these cubes, we see that there exist almost disjoint rectangles
$\tilde{R}_1, \ldots, \tilde{R}_M$ such that $\bigcup_{j=1}^{N} Q_j = \bigcup_{j=1}^{M} \tilde{R}_j$. By taking rectangles $R_j$
contained in $\tilde{R}_j$, and slightly smaller in size, we find a collection of disjoint
rectangles that satisfy $m(E \triangle \bigcup_{j=1}^{M} R_j) \le 2\epsilon$. Therefore

$$f(x) = \sum_{j=1}^{M} \chi_{R_j}(x),$$

except possibly on a set of measure $\le 2\epsilon$. Consequently, for every $k \ge 1$,
there exists a step function $\psi_k(x)$ such that if

$$E_k = \{x : f(x) \ne \psi_k(x)\},$$

then $m(E_k) \le 2^{-k}$. If we let $F_K = \bigcup_{j=K+1}^{\infty} E_j$ and $F = \bigcap_{K=1}^{\infty} F_K$, then
$m(F) = 0$ since $m(F_K) \le 2^{-K}$, and $\psi_k(x) \to f(x)$ for all $x$ in the
complement of $F$, which is the desired result.

4.3 Littlewood’s three principles

Although the notions of measurable sets and measurable functions rep- 
resent new tools, we should not overlook their relation to the older con- 
cepts they replaced. Littlewood aptly summarized these connections in 
the form of three principles that provide a useful intuitive guide in the 
initial study of the theory. 

(i) Every set is nearly a finite union of intervals. 

(ii) Every function is nearly continuous. 

(iii) Every convergent sequence is nearly uniformly convergent. 

The sets and functions referred to above are of course assumed to 
be measurable. The catch is in the word “nearly,” which has to be 
understood appropriately in each context. A precise version of the first 
principle appears in part (iv) of Theorem 3.4. An exact formulation of 
the third principle is given in the following important result. 

Theorem 4.4 (Egorov) Suppose $\{f_k\}_{k=1}^{\infty}$ is a sequence of measurable
functions defined on a measurable set $E$ with $m(E) < \infty$, and assume
that $f_k \to f$ a.e. on $E$. Given $\epsilon > 0$, we can find a closed set $A_\epsilon \subset E$
such that $m(E - A_\epsilon) \le \epsilon$ and $f_k \to f$ uniformly on $A_\epsilon$.

Proof. We may assume without loss of generality that $f_k(x) \to f(x)$
for every $x \in E$. For each pair of non-negative integers $n$ and $k$, let

$$E_k^n = \{x \in E : |f_j(x) - f(x)| < 1/n, \text{ for all } j > k\}.$$

Now fix $n$ and note that $E_k^n \subset E_{k+1}^n$, and $E_k^n \nearrow E$ as $k$ tends to infinity.
By Corollary 3.3, we find that there exists $k_n$ such that
$m(E - E_{k_n}^n) < 1/2^n$. By construction, we then have

$$|f_j(x) - f(x)| < 1/n \quad \text{whenever } j > k_n \text{ and } x \in E_{k_n}^n.$$

We choose $N$ so that $\sum_{n=N}^{\infty} 2^{-n} < \epsilon/2$, and let

$$\tilde{A}_\epsilon = \bigcap_{n \ge N} E_{k_n}^n.$$

We first observe that

$$m(E - \tilde{A}_\epsilon) \le \sum_{n=N}^{\infty} m(E - E_{k_n}^n) < \epsilon/2.$$

Next, if $\delta > 0$, we choose $n \ge N$ such that $1/n < \delta$, and note that
$x \in \tilde{A}_\epsilon$ implies $x \in E_{k_n}^n$. We see therefore that $|f_j(x) - f(x)| < \delta$ whenever
$j > k_n$. Hence $f_k$ converges uniformly to $f$ on $\tilde{A}_\epsilon$.

Finally, using Theorem 3.4 choose a closed subset $A_\epsilon \subset \tilde{A}_\epsilon$ with
$m(\tilde{A}_\epsilon - A_\epsilon) < \epsilon/2$. As a result, we have $m(E - A_\epsilon) < \epsilon$ and the theorem is
proved.
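
As an informal illustration of Egorov's theorem (not part of the proof), consider the sequence $f_k(x) = x^k$ on $[0, 1]$, which converges to $0$ for every $x < 1$ and hence almost everywhere. The sketch below, with the hypothetical helper `sup_error`, estimates the supremum of $|f_k - 0|$ on the closed set $[0, 1 - \epsilon]$ by sampling a uniform grid. The grid sampling is an assumption made for simplicity and only approximates the true supremum in general.

```python
def sup_error(k, eps, samples=10_000):
    """Approximate the sup of |x**k - 0| over the closed set [0, 1 - eps] on a uniform grid."""
    top = 1.0 - eps
    return max((top * i / samples) ** k for i in range(samples + 1))

# On all of [0, 1] the supremum stays at 1 for every k, so convergence is not uniform there.
# After discarding a set of measure eps, the supremum tends to 0 as k grows.
for k in (10, 50, 200):
    print(k, sup_error(k, 0.0), sup_error(k, 0.05))
```

Here the exceptional set $(1 - \epsilon, 1]$ has measure $\epsilon$, matching the role of $E - A_\epsilon$ in the statement.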

The next theorem attests to the validity of the second of Littlewood’s
principles.

Theorem 4.5 (Lusin) Suppose $f$ is measurable and finite valued on $E$
with $E$ of finite measure. Then for every $\epsilon > 0$ there exists a closed set
$F_\epsilon$, with

$$F_\epsilon \subset E, \quad \text{and} \quad m(E - F_\epsilon) \le \epsilon$$

and such that $f|_{F_\epsilon}$ is continuous.

By $f|_{F_\epsilon}$ we mean the restriction of $f$ to the set $F_\epsilon$. The conclusion of
the theorem states that if $f$ is viewed as a function defined only on $F_\epsilon$,
then $f$ is continuous. However, the theorem does not make the stronger
assertion that the function $f$ defined on $E$ is continuous at the points of
$F_\epsilon$.

Proof. Let $f_n$ be a sequence of step functions so that $f_n \to f$ a.e.
Then we may find sets $E_n$ so that $m(E_n) < 1/2^n$ and $f_n$ is continuous
outside $E_n$. By Egorov’s theorem, we may find a set $A_{\epsilon/3}$ on which
$f_n \to f$ uniformly and $m(E - A_{\epsilon/3}) \le \epsilon/3$. Then we consider

$$F' = A_{\epsilon/3} - \bigcup_{n \ge N} E_n$$

for $N$ so large that $\sum_{n \ge N} 1/2^n < \epsilon/3$. Now for every $n \ge N$ the function
$f_n$ is continuous on $F'$; thus $f$ (being the uniform limit of $\{f_n\}$) is also
continuous on $F'$. To finish the proof, we merely need to approximate
the set $F'$ by a closed set $F_\epsilon \subset F'$ such that $m(F' - F_\epsilon) < \epsilon/3$.


5* The Brunn-Minkowski inequality 

Since addition and multiplication by scalars are basic features of vector
spaces, it is not surprising that properties of these operations arise in a
fundamental way in the theory of Lebesgue measure on $\mathbb{R}^d$. We have
already discussed in this connection the translation-invariance and relative
dilation-invariance of Lebesgue measure. Here we come to the study of
the sum of two measurable sets $A$ and $B$, defined as

$$A + B = \{x \in \mathbb{R}^d : x = x' + x'' \text{ with } x' \in A \text{ and } x'' \in B\}.$$

This notion is of importance in a number of questions, in particular in
the theory of convex sets; we shall apply it to the isoperimetric problem
in Chapter 3.

In this regard the first (admittedly vague) question we can pose is
whether one can establish any general estimate for the measure of $A + B$
in terms of the measures of $A$ and $B$ (assuming that these three sets
are measurable). We can see easily that it is not possible to obtain an
upper bound for $m(A + B)$ in terms of $m(A)$ and $m(B)$. Indeed, simple
examples show that we may have $m(A) = m(B) = 0$ while $m(A + B) > 0$.
(See Exercise 20.)

In the converse direction one might ask for a general estimate of the
form

$$m(A + B)^\alpha \ge c_\alpha \left( m(A)^\alpha + m(B)^\alpha \right),$$

where $\alpha$ is a positive number and the constant $c_\alpha$ is independent of $A$
and $B$. Clearly, the best one can hope for is $c_\alpha = 1$. The role of the
exponent $\alpha$ can be understood by considering convex sets. Such sets
$A$ are defined by the property that whenever $x$ and $y$ are in $A$ then
the line segment joining them, $\{xt + y(1 - t) : 0 \le t \le 1\}$, also belongs
to $A$. If we recall the definition $\lambda A = \{\lambda x, x \in A\}$ for $\lambda > 0$, we note
that whenever $A$ is convex, then $A + \lambda A = (1 + \lambda)A$. However,
$m((1 + \lambda)A) = (1 + \lambda)^d m(A)$, and thus the presumed inequality can hold only
if $(1 + \lambda)^{d\alpha} \ge 1 + \lambda^{d\alpha}$, for all $\lambda > 0$. Now

(7)    $(a + b)^\gamma \ge a^\gamma + b^\gamma$ if $\gamma \ge 1$ and $a, b \ge 0$,

while the reverse inequality holds if $0 \le \gamma \le 1$. (See Exercise 38.) This
yields $\alpha \ge 1/d$. Moreover, (7) shows that the inequality with the
exponent $1/d$ implies the corresponding inequality with $\alpha \ge 1/d$, and so we
are naturally led to the inequality

(8)    $m(A + B)^{1/d} \ge m(A)^{1/d} + m(B)^{1/d}$.

Before proceeding with the proof of (8), we need to mention a technical
impediment that arises. While we may assume that $A$ and $B$ are
measurable, it does not necessarily follow that then $A + B$ is measurable.
(See Exercise 13 in the next chapter.) However it is easily seen that this
difficulty does not occur when, for example, $A$ and $B$ are closed sets, or
when one of them is open. (See Exercise 19.)

With the above considerations in mind we can state the main result. 

Theorem 5.1 Suppose $A$ and $B$ are measurable sets in $\mathbb{R}^d$ and their
sum $A + B$ is also measurable. Then the inequality (8) holds.

Let us first check (8) when $A$ and $B$ are rectangles with side lengths
$\{a_j\}_{j=1}^{d}$ and $\{b_j\}_{j=1}^{d}$, respectively. Then (8) becomes

(9)    $\left( \prod_{j=1}^{d} (a_j + b_j) \right)^{1/d} \ge \left( \prod_{j=1}^{d} a_j \right)^{1/d} + \left( \prod_{j=1}^{d} b_j \right)^{1/d}$,

which by homogeneity we can reduce to the special case where
$a_j + b_j = 1$ for each $j$. In fact, notice that if we replace $a_j, b_j$ by $\lambda_j a_j, \lambda_j b_j$,
with $\lambda_j > 0$, then both sides of (9) are multiplied by $(\lambda_1 \lambda_2 \cdots \lambda_d)^{1/d}$.
We then need only choose $\lambda_j = (a_j + b_j)^{-1}$. With this reduction, the
inequality (9) is an immediate consequence of the arithmetic-geometric
inequality (Exercise 39)

$$\frac{1}{d} \sum_{j=1}^{d} x_j \ge \left( \prod_{j=1}^{d} x_j \right)^{1/d}, \quad \text{for all } x_j \ge 0:$$

we add the two inequalities that result when we set $x_j = a_j$ and $x_j = b_j$,
respectively.
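
The rectangle case can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the helper name `bm_rectangles` is hypothetical, side lengths are sampled at random, and a small floating-point tolerance is allowed. It compares the two sides of the inequality for rectangles, using the fact that the sum of two rectangles with side lengths $a_j$ and $b_j$ is a rectangle with side lengths $a_j + b_j$.

```python
import math
import random

def bm_rectangles(a, b):
    """Return (lhs, rhs) of the rectangle inequality for side lengths a and b."""
    d = len(a)
    # Left side: m(A + B)^(1/d), where A + B has side lengths a_j + b_j.
    lhs = math.prod(x + y for x, y in zip(a, b)) ** (1 / d)
    # Right side: m(A)^(1/d) + m(B)^(1/d).
    rhs = math.prod(a) ** (1 / d) + math.prod(b) ** (1 / d)
    return lhs, rhs

random.seed(0)
for _ in range(1000):
    d = random.randint(1, 6)
    a = [random.uniform(0.1, 5.0) for _ in range(d)]
    b = [random.uniform(0.1, 5.0) for _ in range(d)]
    lhs, rhs = bm_rectangles(a, b)
    assert lhs >= rhs - 1e-12  # tolerance for rounding error
```

Equality is approached when one rectangle is a dilate of the other, for example side lengths $(2, 6)$ and $(1, 3)$, in line with the discussion of convex sets above.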

We next turn to the case when each of $A$ and $B$ is the union of finitely
many rectangles whose interiors are disjoint. We shall prove (8) in this
case by induction on the total number of rectangles in $A$ and $B$. We
denote this number by $n$. Here it is important to note that the desired
inequality is unchanged when we translate $A$ and $B$ independently. In
fact, replacing $A$ by $A + h$ and $B$ by $B + h'$ replaces $A + B$ by
$A + B + h + h'$, and thus the corresponding measures remain the same. We now
choose a pair of disjoint rectangles $R_1$ and $R_2$ in the collection making up
$A$, and we note that they can be separated by a coordinate hyperplane.
Thus we may assume that for some $j$, after translation by an appropriate
$h$, $R_1$ lies in $A_- = A \cap \{x_j \le 0\}$, and $R_2$ in $A_+ = A \cap \{0 \le x_j\}$. Observe
also that both $A_+$ and $A_-$ contain at least one less rectangle than $A$ does,
and $A = A_- \cup A_+$.

We next translate $B$ so that $B_- = B \cap \{x_j \le 0\}$ and $B_+ = B \cap \{x_j \ge 0\}$
satisfy

$$\frac{m(B_\pm)}{m(B)} = \frac{m(A_\pm)}{m(A)}.$$

However, $A + B \supset (A_+ + B_+) \cup (A_- + B_-)$, and the union on the right
is essentially disjoint, since the two parts lie in different half-spaces.
Moreover, the total number of rectangles in either $A_+$ and $B_+$, or $A_-$
and $B_-$ is also less than $n$. Thus the induction hypothesis applies and

$$\begin{aligned}
m(A + B) &\ge m(A_+ + B_+) + m(A_- + B_-) \\
&\ge \left( m(A_+)^{1/d} + m(B_+)^{1/d} \right)^d + \left( m(A_-)^{1/d} + m(B_-)^{1/d} \right)^d \\
&= m(A_+) \left[ 1 + \left( \frac{m(B)}{m(A)} \right)^{1/d} \right]^d + m(A_-) \left[ 1 + \left( \frac{m(B)}{m(A)} \right)^{1/d} \right]^d \\
&= \left( m(A)^{1/d} + m(B)^{1/d} \right)^d,
\end{aligned}$$

which gives the desired inequality (8) when $A$ and $B$ are both finite
unions of rectangles with disjoint interiors.

Next, this quickly implies the result when $A$ and $B$ are open sets of
finite measure. Indeed, by Theorem 1.4, for any $\epsilon > 0$ we can find unions
of almost disjoint rectangles $A_\epsilon$ and $B_\epsilon$, such that $A_\epsilon \subset A$, $B_\epsilon \subset B$, with
$m(A) \le m(A_\epsilon) + \epsilon$ and $m(B) \le m(B_\epsilon) + \epsilon$. Since $A + B \supset A_\epsilon + B_\epsilon$, the
inequality (8) for $A_\epsilon$ and $B_\epsilon$ and a passage to a limit gives the desired
result. From this, we can pass to the case where $A$ and $B$ are arbitrary
compact sets, by noting first that $A + B$ is then compact, and that if
we define $A^\epsilon = \{x : d(x, A) < \epsilon\}$, then $A^\epsilon$ are open, and $A^\epsilon \searrow A$ as
$\epsilon \to 0$. With similar definitions for $B^\epsilon$ and $(A + B)^\epsilon$, we observe also that
$A + B \subset A^\epsilon + B^\epsilon \subset (A + B)^{2\epsilon}$. Hence, letting $\epsilon \to 0$, we see that (8) for
$A^\epsilon$ and $B^\epsilon$ implies the desired result for $A$ and $B$. The general case,
in which we assume that $A$, $B$, and $A + B$ are measurable, then follows
by approximating $A$ and $B$ from inside by compact sets, as in (iii) of
Theorem 3.4.


6 Exercises 

1. Prove that the Cantor set $\mathcal{C}$ constructed in the text is totally disconnected and
perfect. In other words, given two distinct points $x, y \in \mathcal{C}$, there is a point $z \notin \mathcal{C}$
that lies between $x$ and $y$, and yet $\mathcal{C}$ has no isolated points.

[Hint: If $x, y \in \mathcal{C}$ and $|x - y| > 1/3^k$, then $x$ and $y$ belong to two different intervals
in $C_k$. Also, given any $x \in \mathcal{C}$ there is an end-point $y_k$ of some interval in $C_k$ that
satisfies $x \ne y_k$ and $|x - y_k| \le 1/3^k$.]


2. The Cantor set C can also be described in terms of ternary expansions. 




(a) Every number in $[0, 1]$ has a ternary expansion

$$x = \sum_{k=1}^{\infty} a_k 3^{-k}, \quad \text{where } a_k = 0, 1, \text{ or } 2.$$

Note that this decomposition is not unique since, for example, $1/3 = \sum_{k=2}^{\infty} 2/3^k$.

Prove that $x \in \mathcal{C}$ if and only if $x$ has a representation as above where every
$a_k$ is either 0 or 2.

(b) The Cantor-Lebesgue function is defined on $\mathcal{C}$ by

$$F(x) = \sum_{k=1}^{\infty} \frac{b_k}{2^k} \quad \text{if } x = \sum_{k=1}^{\infty} a_k 3^{-k}, \text{ where } b_k = a_k/2.$$

In this definition, we choose the expansion of $x$ in which $a_k = 0$ or 2.

Show that $F$ is well defined and continuous on $\mathcal{C}$, and moreover $F(0) = 0$ as
well as $F(1) = 1$.

(c) Prove that $F : \mathcal{C} \to [0, 1]$ is surjective, that is, for every $y \in [0, 1]$ there exists
$x \in \mathcal{C}$ such that $F(x) = y$.

(d) One can also extend $F$ to be a continuous function on $[0, 1]$ as follows. Note
that if $(a, b)$ is an open interval of the complement of $\mathcal{C}$, then $F(a) = F(b)$.
Hence we may define $F$ to have the constant value $F(a)$ in that interval.

A geometrical construction of $F$ is described in Chapter 3.
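
The digit description in parts (a) and (b) lends itself to a small computation. The following sketch assumes a point of the Cantor set is given by a finite list of ternary digits, each 0 or 2 (all later digits taken to be 0); the function names `cantor_point` and `cantor_lebesgue` are hypothetical. Exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def cantor_point(digits):
    """x = sum of a_k 3^(-k) for a finite list of ternary digits a_k in {0, 2}."""
    return sum(Fraction(a, 3 ** k) for k, a in enumerate(digits, start=1))

def cantor_lebesgue(digits):
    """F(x) = sum of b_k 2^(-k) with b_k = a_k / 2, for the same digit list."""
    if any(a not in (0, 2) for a in digits):
        raise ValueError("digits must be 0 or 2")
    return sum(Fraction(a // 2, 2 ** k) for k, a in enumerate(digits, start=1))

# The right end-point 2/3 of the first removed interval has digits (2).
# The left end-point 1/3 = 0.0222... is approached by (0, 2, 2, ..., 2).
right = [2]
left = [0] + [2] * 20
print(cantor_point(right), cantor_lebesgue(right))
print(float(cantor_point(left)), float(cantor_lebesgue(left)))
```

The values of $F$ at the two end-points approach each other as more digits are used, which is the observation behind the extension in part (d).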

3. Cantor sets of constant dissection. Consider the unit interval $[0, 1]$, and
let $\xi$ be a fixed real number with $0 < \xi < 1$ (the case $\xi = 1/3$ corresponds to the
Cantor set $\mathcal{C}$ in the text).

In stage 1 of the construction, remove the centrally situated open interval in
$[0, 1]$ of length $\xi$. In stage 2, remove two central intervals each of relative length
$\xi$, one in each of the remaining intervals after stage 1, and so on.

Let $\mathcal{C}_\xi$ denote the set which remains after applying the above procedure
indefinitely. (Footnote: The set we call $\mathcal{C}_\xi$ is sometimes denoted by $C_{\frac{1-\xi}{2}}$.)

(a) Prove that the complement of $\mathcal{C}_\xi$ in $[0, 1]$ is the union of open intervals of
total length equal to 1.

(b) Show directly that $m_*(\mathcal{C}_\xi) = 0$.

[Hint: After the $k$th stage, show that the remaining set has total length $= (1 - \xi)^k$.]

4. Cantor-like sets. Construct a closed set $\hat{\mathcal{C}}$ so that at the $k$th stage of the
construction one removes $2^{k-1}$ centrally situated open intervals each of length $\ell_k$,
with

$$\ell_1 + 2\ell_2 + \cdots + 2^{k-1}\ell_k < 1.$$

(a) If $\ell_j$ are chosen small enough, then $\sum_{k=1}^{\infty} 2^{k-1}\ell_k < 1$. In this case, show
that $m(\hat{\mathcal{C}}) > 0$, and in fact, $m(\hat{\mathcal{C}}) = 1 - \sum_{k=1}^{\infty} 2^{k-1}\ell_k$.

(b) Show that if $x \in \hat{\mathcal{C}}$, then there exists a sequence of points $\{x_n\}_{n=1}^{\infty}$ such
that $x_n \notin \hat{\mathcal{C}}$, yet $x_n \to x$ and $x_n \in I_n$, where $I_n$ is a sub-interval in the
complement of $\hat{\mathcal{C}}$ with $|I_n| \to 0$.

(c) Prove as a consequence that $\hat{\mathcal{C}}$ is perfect, and contains no open interval.

(d) Show also that $\hat{\mathcal{C}}$ is uncountable.
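
The length bookkeeping in Exercises 3 and 4 can be checked on concrete choices of the removed lengths. The sketch below is an illustration only: the helper `remaining_length` is hypothetical, and it tracks just the total length left after finitely many stages, not the set itself. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def remaining_length(lengths):
    """Total length left in [0, 1] after stage k removes 2**(k-1) intervals of length lengths[k-1]."""
    removed = sum(2 ** k * l for k, l in enumerate(lengths))
    if removed >= 1:
        raise ValueError("removed lengths must total less than 1")
    return 1 - removed

# Constant dissection with ratio xi: stage k removes 2^(k-1) intervals of length xi * ((1 - xi)/2)^(k-1).
xi = Fraction(1, 3)
const = [xi * ((1 - xi) / 2) ** k for k in range(5)]
print(remaining_length(const), (1 - xi) ** 5)

# A Cantor-like choice, removed lengths 4^(-k): the remaining length stays above 1/2.
fat = [Fraction(1, 4 ** k) for k in range(1, 11)]
print(remaining_length(fat))
```

In the first case the remaining length decays geometrically, while in the second it stays bounded below, which is the contrast between the two exercises.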


5. Suppose $E$ is a given set, and $\mathcal{O}_n$ is the open set:

$$\mathcal{O}_n = \{x : d(x, E) < 1/n\}.$$

Show:

(a) If $E$ is compact, then $m(E) = \lim_{n \to \infty} m(\mathcal{O}_n)$.

(b) However, the conclusion in (a) may be false for $E$ closed and unbounded; or
$E$ open and bounded.


6. Using translations and dilations, prove the following: Let $B$ be a ball in $\mathbb{R}^d$ of
radius $r$. Then $m(B) = v_d r^d$, where $v_d = m(B_1)$, and $B_1$ is the unit ball,
$B_1 = \{x \in \mathbb{R}^d : |x| < 1\}$.

A calculation of the constant $v_d$ is postponed until Exercise 14 in the next
chapter.

7. If $\delta = (\delta_1, \ldots, \delta_d)$ is a $d$-tuple of positive numbers $\delta_i > 0$, and $E$ is a subset of
$\mathbb{R}^d$, we define $\delta E$ by

$$\delta E = \{(\delta_1 x_1, \ldots, \delta_d x_d) : \text{where } (x_1, \ldots, x_d) \in E\}.$$

Prove that $\delta E$ is measurable whenever $E$ is measurable, and

$$m(\delta E) = \delta_1 \cdots \delta_d \, m(E).$$


8. Suppose $L$ is a linear transformation of $\mathbb{R}^d$. Show that if $E$ is a measurable
subset of $\mathbb{R}^d$, then so is $L(E)$, by proceeding as follows:

(a) Note that if $E$ is compact, so is $L(E)$. Hence if $E$ is an $F_\sigma$ set, so is $L(E)$.

(b) Because $L$ automatically satisfies the inequality

$$|L(x) - L(x')| \le M|x - x'|$$

for some $M$, we can see that $L$ maps any cube of side length $\ell$ into a
cube of side length $c_d M \ell$, with $c_d = 2\sqrt{d}$. Now if $m(E) = 0$, there is a
collection of cubes $\{Q_j\}$ such that $E \subset \bigcup_j Q_j$, and $\sum_j m(Q_j) < \epsilon$. Thus
$m_*(L(E)) \le c'\epsilon$, and hence $m(L(E)) = 0$. Finally, use Corollary 3.5.

One can show that $m(L(E)) = |\det L| \, m(E)$; see Problem 4 in the next chapter.

9. Give an example of an open set $\mathcal{O}$ with the following property: the boundary
of the closure of $\mathcal{O}$ has positive Lebesgue measure.

[Hint: Consider the set obtained by taking the union of open intervals which are
deleted at the odd steps in the construction of a Cantor-like set.]

10. This exercise provides a construction of a decreasing sequence of positive
continuous functions on the interval $[0, 1]$, whose pointwise limit is not Riemann
integrable.

Let $\hat{\mathcal{C}}$ denote a Cantor-like set obtained from the construction detailed in
Exercise 4, so that in particular $m(\hat{\mathcal{C}}) > 0$. Let $F_1$ denote a piecewise-linear and
continuous function on $[0, 1]$, with $F_1 = 1$ in the complement of the first interval removed
in the construction of $\hat{\mathcal{C}}$, $F_1 = 0$ at the center of this interval, and $0 \le F_1(x) \le 1$ for
all $x$. Similarly, construct $F_2 = 1$ in the complement of the intervals in stage two of
the construction of $\hat{\mathcal{C}}$, with $F_2 = 0$ at the center of these intervals, and $0 \le F_2 \le 1$.
Continuing this way, let $f_n = F_1 \cdot F_2 \cdots F_n$ (see Figure 5).

[Figure 5. Construction of $\{F_n\}$ in Exercise 10]

Prove the following:

(a) For all $n \ge 1$ and all $x \in [0, 1]$, one has $0 \le f_n(x) \le 1$ and $f_n(x) \ge f_{n+1}(x)$.
Therefore, $f_n(x)$ converges to a limit as $n \to \infty$ which we denote by $f(x)$.

(b) The function $f$ is discontinuous at every point of $\hat{\mathcal{C}}$.

[Hint: Note that $f(x) = 1$ if $x \in \hat{\mathcal{C}}$, and find a sequence of points $\{x_n\}$ so
that $x_n \to x$ and $f(x_n) = 0$.]

Now $\int f_n(x)\,dx$ is decreasing, hence $\int f_n$ converges. However, a bounded
function is Riemann integrable if and only if its set of discontinuities has measure zero.
(The proof of this fact, which is given in the Appendix of Book I, is outlined in
Problem 4.) Since $f$ is discontinuous on a set of positive measure, we find that $f$
is not Riemann integrable.

11. Let A be the subset of [0, 1] which consists of all numbers which do not have 
the digit 4 appearing in their decimal expansion. Find m(A). 

12. Theorem 1.3 states that every open set in $\mathbb{R}$ is the disjoint union of open
intervals. The analogue in $\mathbb{R}^d$, $d \ge 2$, is generally false. Prove the following:

(a) An open disc in $\mathbb{R}^2$ is not the disjoint union of open rectangles.

[Hint: What happens to the boundary of any of these rectangles?]

(b) An open connected set $\Omega$ is the disjoint union of open rectangles if and only
if $\Omega$ is itself an open rectangle.


13. The following deals with $G_\delta$ and $F_\sigma$ sets.

(a) Show that a closed set is a $G_\delta$ and an open set an $F_\sigma$.

[Hint: If $F$ is closed, consider $\mathcal{O}_n = \{x : d(x, F) < 1/n\}$.]

(b) Give an example of an $F_\sigma$ which is not a $G_\delta$.

[Hint: This is more difficult; let $F$ be a denumerable set that is dense.]

(c) Give an example of a Borel set which is not a $G_\delta$ nor an $F_\sigma$.

14. The purpose of this exercise is to show that covering by a finite number of
intervals will not suffice in the definition of the outer measure $m_*$.

The outer Jordan content $J_*(E)$ of a set $E$ in $\mathbb{R}$ is defined by

$$J_*(E) = \inf \sum_{j=1}^{N} |I_j|,$$

where the inf is taken over every finite covering $E \subset \bigcup_{j=1}^{N} I_j$, by intervals $I_j$.

(a) Prove that $J_*(E) = J_*(\overline{E})$ for every set $E$ (here $\overline{E}$ denotes the closure of
$E$).

(b) Exhibit a countable subset $E \subset [0, 1]$ such that $J_*(E) = 1$ while $m_*(E) = 0$.

15. At the start of the theory, one might define the outer measure by taking
coverings by rectangles instead of cubes. More precisely, we define

$$m_*^{\mathcal{R}}(E) = \inf \sum_{j=1}^{\infty} |R_j|,$$

where the inf is now taken over all countable coverings $E \subset \bigcup_{j=1}^{\infty} R_j$ by (closed)
rectangles.

Show that this approach gives rise to the same theory of measure developed in
the text, by proving that $m_*(E) = m_*^{\mathcal{R}}(E)$ for every subset $E$ of $\mathbb{R}^d$.

[Hint: Use Lemma 1.1.]

16. The Borel-Cantelli lemma. Suppose $\{E_k\}_{k=1}^{\infty}$ is a countable family of
measurable subsets of $\mathbb{R}^d$ and that

$$\sum_{k=1}^{\infty} m(E_k) < \infty.$$

Let

$$E = \{x \in \mathbb{R}^d : x \in E_k, \text{ for infinitely many } k\} = \limsup_{k \to \infty}(E_k).$$

(a) Show that $E$ is measurable.

(b) Prove $m(E) = 0$.

[Hint: Write $E = \bigcap_{n=1}^{\infty} \bigcup_{k \ge n} E_k$.]


17. Let $\{f_n\}$ be a sequence of measurable functions on $[0, 1]$ with $|f_n(x)| < \infty$ for
a.e. $x$. Show that there exists a sequence $c_n$ of positive real numbers such that

$$\frac{f_n(x)}{c_n} \to 0 \quad \text{a.e. } x.$$

[Hint: Pick $c_n$ such that $m(\{x : |f_n(x)/c_n| > 1/n\}) < 2^{-n}$, and apply the
Borel-Cantelli lemma.]


18. Prove the following assertion: Every measurable function is the limit a.e. of a 
sequence of continuous functions. 

19. Here are some observations regarding the set operation $A + B$.

(a) Show that if either $A$ or $B$ is open, then $A + B$ is open.

(b) Show that if $A$ and $B$ are closed, then $A + B$ is measurable.

(c) Show, however, that $A + B$ might not be closed even though $A$ and $B$ are
closed.

[Hint: For (b) show that $A + B$ is an $F_\sigma$ set.]

20. Show that there exist closed sets $A$ and $B$ with $m(A) = m(B) = 0$, but
$m(A + B) > 0$:

(a) In $\mathbb{R}$, let $A = \mathcal{C}$ (the Cantor set), $B = \mathcal{C}/2$. Note that $A + B \supset [0, 1]$.

(b) In $\mathbb{R}^2$, observe that if $A = I \times \{0\}$ and $B = \{0\} \times I$ (where $I = [0, 1]$), then
$A + B = I \times I$.


21. Prove that there is a continuous function that maps a Lebesgue measurable
set to a non-measurable set.

[Hint: Consider a non-measurable subset of $[0, 1]$, and its inverse image in $\mathcal{C}$ by the
function $F$ in Exercise 2.]

22. Let $\chi_{[0,1]}$ be the characteristic function of $[0, 1]$. Show that there is no
everywhere continuous function $f$ on $\mathbb{R}$ such that

$$f(x) = \chi_{[0,1]}(x) \quad \text{almost everywhere.}$$


23. Suppose $f(x, y)$ is a function on $\mathbb{R}^2$ that is separately continuous: for each
fixed variable, $f$ is continuous in the other variable. Prove that $f$ is measurable
on $\mathbb{R}^2$.

[Hint: Approximate $f$ in the variable $x$ by piecewise-linear functions $f_n$ so that
$f_n \to f$ pointwise.]

24. Does there exist an enumeration $\{r_n\}_{n=1}^{\infty}$ of the rationals, such that the
complement of

$$\bigcup_{n=1}^{\infty} \left( r_n - \frac{1}{n}, \; r_n + \frac{1}{n} \right)$$

in $\mathbb{R}$ is non-empty?

[Hint: Find an enumeration where the only rationals outside of a fixed bounded
interval take the form $r_n$, with $n = m^2$ for some integer $m$.]

25. An alternative definition of measurability is as follows: $E$ is measurable if for
every $\epsilon > 0$ there is a closed set $F$ contained in $E$ with $m_*(E - F) < \epsilon$. Show that
this definition is equivalent with the one given in the text.

26. Suppose $A \subset E \subset B$, where $A$ and $B$ are measurable sets of finite measure.
Prove that if $m(A) = m(B)$, then $E$ is measurable.

27. Suppose $E_1$ and $E_2$ are a pair of compact sets in $\mathbb{R}^d$ with $E_1 \subset E_2$, and let
$a = m(E_1)$ and $b = m(E_2)$. Prove that for any $c$ with $a < c < b$, there is a compact
set $E$ with $E_1 \subset E \subset E_2$ and $m(E) = c$.

[Hint: As an example, if $d = 1$ and $E$ is a measurable subset of $[0, 1]$, consider
$m(E \cap [0, t])$ as a function of $t$.]

28. Let $E$ be a subset of $\mathbb{R}$ with $m_*(E) > 0$. Prove that for each $0 < \alpha < 1$, there
exists an open interval $I$ so that

$$m_*(E \cap I) \ge \alpha \, m_*(I).$$

Loosely speaking, this estimate shows that $E$ contains almost a whole interval.

[Hint: Choose an open set $\mathcal{O}$ that contains $E$, and such that $m_*(E) \ge \alpha \, m_*(\mathcal{O})$.
Write $\mathcal{O}$ as the countable union of disjoint open intervals, and show that one of
these intervals must satisfy the desired property.]

29. Suppose $E$ is a measurable subset of $\mathbb{R}$ with $m(E) > 0$. Prove that the
difference set of $E$, which is defined by

$$\{z \in \mathbb{R} : z = x - y \text{ for some } x, y \in E\},$$

contains an open interval centered at the origin.

If $E$ contains an interval, then the conclusion is straightforward. In general, one
may rely on Exercise 28.

[Hint: Indeed, by Exercise 28, there exists an open interval $I$ so that
$m(E \cap I) \ge (9/10)\, m(I)$. If we denote $E \cap I$ by $E_0$, and suppose that the difference set of $E_0$
does not contain an open interval around the origin, then for arbitrarily small $a$ the
sets $E_0$ and $E_0 + a$ are disjoint. From the fact that $(E_0 \cup (E_0 + a)) \subset (I \cup (I + a))$
we get a contradiction, since the left-hand side has measure $2m(E_0)$, while the
right-hand side has measure only slightly larger than $m(I)$.]

A more general formulation of this result is as follows.

30. If $E$ and $F$ are measurable, and $m(E) > 0$, $m(F) > 0$, prove that

$$E + F = \{x + y : x \in E, \; y \in F\}$$

contains an interval.

31. The result in Exercise 29 provides an alternate proof of the non-measurability
of the set $\mathcal{N}$ studied in the text. In fact, we may also prove the non-measurability
of a set in $\mathbb{R}$ that is very closely related to the set $\mathcal{N}$.

Given two real numbers $x$ and $y$, we shall write as before that $x \sim y$ whenever
the difference $x - y$ is rational. Let $\mathcal{N}^*$ denote a set that consists of one element in
each equivalence class of $\sim$. Prove that $\mathcal{N}^*$ is non-measurable by using the result
in Exercise 29.

[Hint: If $\mathcal{N}^*$ is measurable, then so are its translates $\mathcal{N}_n^* = \mathcal{N}^* + r_n$, where
$\{r_n\}_{n=1}^{\infty}$ is an enumeration of $\mathbb{Q}$. How does this imply that $m(\mathcal{N}^*) > 0$? Can the
difference set of $\mathcal{N}^*$ contain an open interval centered at the origin?]

32. Let $\mathcal{N}$ denote the non-measurable subset of $I = [0, 1]$ constructed at the end
of Section 3.

(a) Prove that if $E$ is a measurable subset of $\mathcal{N}$, then $m(E) = 0$.

(b) If $G$ is a subset of $\mathbb{R}$ with $m_*(G) > 0$, prove that a subset of $G$ is
non-measurable.

[Hint: For (a) use the translates of $E$ by the rationals.]

33. Let $\mathcal{N}$ denote the non-measurable set constructed in the text. Recall from the
exercise above that measurable subsets of $\mathcal{N}$ have measure zero.

Show that the set $\mathcal{N}^c = I - \mathcal{N}$ satisfies $m_*(\mathcal{N}^c) = 1$, and conclude that if
$E_1 = \mathcal{N}$ and $E_2 = \mathcal{N}^c$, then

$$m_*(E_1) + m_*(E_2) \ne m_*(E_1 \cup E_2),$$

although $E_1$ and $E_2$ are disjoint.

[Hint: To prove that $m_*(\mathcal{N}^c) = 1$, argue by contradiction and pick a measurable
set $U$ such that $U \subset I$, $\mathcal{N}^c \subset U$ and $m_*(U) < 1 - \epsilon$.]

34. Let $\mathcal{C}_1$ and $\mathcal{C}_2$ be any two Cantor sets (constructed in Exercise 3). Show that
there exists a function $F : [0, 1] \to [0, 1]$ with the following properties:

(i) $F$ is continuous and bijective,

(ii) $F$ is monotonically increasing,

(iii) $F$ maps $\mathcal{C}_1$ surjectively onto $\mathcal{C}_2$.

[Hint: Copy the construction of the standard Cantor-Lebesgue function.]

35. Give an example of a measurable function $f$ and a continuous function $\Phi$ so
that $f \circ \Phi$ is non-measurable.

[Hint: Let $\Phi : \mathcal{C}_1 \to \mathcal{C}_2$ as in Exercise 34, with $m(\mathcal{C}_1) > 0$ and $m(\mathcal{C}_2) = 0$. Let
$N \subset \mathcal{C}_1$ be non-measurable, and take $f = \chi_{\Phi(N)}$.]

Use the construction in the hint to show that there exists a Lebesgue measurable
set that is not a Borel set.

36. This exercise provides an example of a measurable function $f$ on $[0, 1]$ such
that every function $g$ equivalent to $f$ (in the sense that $f$ and $g$ differ only on a
set of measure zero) is discontinuous at every point.

(a) Construct a measurable set $E \subset [0, 1]$ such that for any non-empty open
sub-interval $I$ in $[0, 1]$, both sets $E \cap I$ and $E^c \cap I$ have positive measure.

(b) Show that $f = \chi_E$ has the property that whenever $g(x) = f(x)$ a.e. $x$, then
$g$ must be discontinuous at every point in $[0, 1]$.

[Hint: For the first part, consider a Cantor-like set of positive measure, and add in
each of the intervals that are omitted in the first step of its construction, another
Cantor-like set. Continue this procedure indefinitely.]

37. Suppose $\Gamma$ is a curve $y = f(x)$ in $\mathbb{R}^2$, where $f$ is continuous. Show that
$m(\Gamma) = 0$.

[Hint: Cover $\Gamma$ by rectangles, using the uniform continuity of $f$.]

38. Prove that $(a + b)^\gamma \ge a^\gamma + b^\gamma$ whenever $\gamma \ge 1$ and $a, b \ge 0$. Also, show that
the reverse inequality holds when $0 \le \gamma \le 1$.

[Hint: Integrate the inequality between $(a + t)^{\gamma - 1}$ and $t^{\gamma - 1}$ from 0 to $b$.]

39. Establish the inequality

(10) $$ \frac{x_1 + \cdots + x_d}{d} \ge (x_1 \cdots x_d)^{1/d} \quad \text{for all } x_j \ge 0,\ j = 1, \ldots, d $$

by using backward induction as follows:

(a) The inequality is true whenever $d$ is a power of 2 ($d = 2^k$, $k \ge 1$).

(b) If (10) holds for some integer $d \ge 2$, then it must hold for $d - 1$, that is,
one has $(y_1 + \cdots + y_{d-1})/(d - 1) \ge (y_1 \cdots y_{d-1})^{1/(d-1)}$ for all $y_j \ge 0$, with
$j = 1, \ldots, d - 1$.

[Hint: For (a), if $k \ge 2$, write $(x_1 + \cdots + x_{2^k})/2^k$ as $(A + B)/2$, where
$A = (x_1 + \cdots + x_{2^{k-1}})/2^{k-1}$, and apply the inequality when $d = 2$. For (b), apply the
inequality to $x_1 = y_1, \ldots, x_{d-1} = y_{d-1}$ and $x_d = (y_1 + \cdots + y_{d-1})/(d - 1)$.]


7 Problems 


1. Given an irrational $x$, one can show (using the pigeon-hole principle, for
example) that there exist infinitely many fractions $p/q$, with relatively prime
integers $p$ and $q$ such that

$$ \left| x - \frac{p}{q} \right| \le \frac{1}{q^2}. $$

However, prove that the set of those $x \in \mathbb{R}$ such that there exist infinitely many
fractions $p/q$, with relatively prime integers $p$ and $q$ such that

$$ \left| x - \frac{p}{q} \right| \le \frac{1}{q^3} \quad (\text{or } \le 1/q^{2+\epsilon}), $$

is a set of measure zero.

[Hint: Use the Borel-Cantelli lemma.]

2. Any open set $\Omega$ can be written as the union of closed cubes, so that $\Omega = \bigcup Q_j$
with the following properties:

(i) The $Q_j$'s have disjoint interiors.

(ii) $d(Q_j, \Omega^c) \approx$ side length of $Q_j$. This means that there are positive constants
$c$ and $C$ so that $c \le d(Q_j, \Omega^c)/\ell(Q_j) \le C$, where $\ell(Q_j)$ denotes the side
length of $Q_j$.

3. Find an example of a measurable subset $C$ of $[0, 1]$ such that $m(C) = 0$, yet the
difference set of $C$ contains a non-trivial interval centered at the origin. Compare
with the result in Exercise 29.

[Hint: Pick the Cantor set $C = \mathcal{C}$. For a fixed $a \in [-1, 1]$, consider the line
$y = x + a$ in the plane, and copy the construction of the Cantor set, but in the cube
$Q = [0, 1] \times [0, 1]$. First, remove all but four closed cubes of side length 1/3, one at
each corner of $Q$; then, repeat this procedure in each of the remaining cubes (see
Figure 6). The resulting set is sometimes called a Cantor dust. Use the property
of nested compact sets to show that the line intersects this Cantor dust.]




Figure 6. Construction of the Cantor dust 


4. Complete the following outline to prove that a bounded function on an interval
$[a, b]$ is Riemann integrable if and only if its set of discontinuities has measure zero.
This argument is given in detail in the appendix to Book I.

Let $f$ be a bounded function on a compact interval $J$, and let $I(c, r)$ denote
the open interval centered at $c$ of radius $r > 0$. Let $\mathrm{osc}(f, c, r) = \sup |f(x) - f(y)|$,
where the supremum is taken over all $x, y \in J \cap I(c, r)$, and define the oscillation
of $f$ at $c$ by $\mathrm{osc}(f, c) = \lim_{r \to 0} \mathrm{osc}(f, c, r)$. Clearly, $f$ is continuous at $c \in J$ if and
only if $\mathrm{osc}(f, c) = 0$.

Prove the following assertions:

(a) For every $\epsilon > 0$, the set of points $c$ in $J$ such that $\mathrm{osc}(f, c) \ge \epsilon$ is compact.

(b) If the set of discontinuities of $f$ has measure 0, then $f$ is Riemann integrable.

[Hint: Given $\epsilon > 0$ let $A_\epsilon = \{c \in J : \mathrm{osc}(f, c) \ge \epsilon\}$. Cover $A_\epsilon$ by a finite
number of open intervals whose total length is $\le \epsilon$. Select an appropriate
partition of $J$ and estimate the difference between the upper and lower sums
of $f$ over this partition.]

(c) Conversely, if $f$ is Riemann integrable on $J$, then its set of discontinuities
has measure 0.

[Hint: The set of discontinuities of $f$ is contained in $\bigcup_n A_{1/n}$. Choose a
partition $P$ such that $U(f, P) - L(f, P) < \epsilon/n$. Show that the total length
of the intervals in $P$ whose interior intersect $A_{1/n}$ is $\le \epsilon$.]


5. Suppose $E$ is measurable with $m(E) < \infty$, and

$$ E = E_1 \cup E_2, \qquad E_1 \cap E_2 = \emptyset. $$

If $m(E) = m_*(E_1) + m_*(E_2)$, then $E_1$ and $E_2$ are measurable.

In particular, if $E \subset Q$, where $Q$ is a finite cube, then $E$ is measurable if and
only if $m(Q) = m_*(E) + m_*(Q - E)$.

6.* The fact that the axiom of choice and the well-ordering principle are equivalent
is a consequence of the following considerations.

One begins by defining a partial ordering on a set $E$ to be a binary relation $\le$
on the set $E$ that satisfies:

(i) $x \le x$ for all $x \in E$.

(ii) If $x \le y$ and $y \le x$, then $x = y$.

(iii) If $x \le y$ and $y \le z$, then $x \le z$.

If in addition $x \le y$ or $y \le x$ whenever $x, y \in E$, then $\le$ is a linear ordering of $E$.

The axiom of choice and the well-ordering principle are then logically equivalent
to the Hausdorff maximal principle:

Every non-empty partially ordered set has a (non-empty) maximal
linearly ordered subset.

In other words, if $E$ is partially ordered by $\le$, then $E$ contains a non-empty subset
$F$ which is linearly ordered by $\le$ and such that if $F$ is contained in a set $G$ also
linearly ordered by $\le$, then $F = G$.

An application of the Hausdorff maximal principle to the collection of all
well-orderings of subsets of $E$ implies the well-ordering principle for $E$. However, the
proof that the axiom of choice implies the Hausdorff maximal principle is more
complicated.

7.* Consider the curve $\Gamma = \{y = f(x)\}$ in $\mathbb{R}^2$, $0 \le x \le 1$. Assume that $f$ is twice
continuously differentiable in $0 \le x \le 1$. Then show that $m(\Gamma + \Gamma) > 0$ if and only
if $\Gamma + \Gamma$ contains an open set, if and only if $f$ is not linear.

8.* Suppose $A$ and $B$ are open sets of finite positive measure. Then we have
equality in the Brunn-Minkowski inequality (8) if and only if $A$ and $B$ are convex
and similar, that is, there are a $\delta > 0$ and an $h \in \mathbb{R}^d$ such that

$$ A = \delta B + h. $$


Integration Theory 


...amongst the many definitions that have been
successively proposed for the integral of real-valued
functions of a real variable, I have retained only those
which, in my opinion, are indispensable to understand
the transformations undergone by the problem of
integration, and to capture the relationship between the
notion of area, so simple in appearance, and certain more
complicated analytical definitions of the integral.

One might ask if there is sufficient interest to oc- 
cupy oneself with such complications, and if it is not 
better to restrict oneself to the study of functions that 
necessitate only simple definitions.... As we shall see 
in this course, we would then have to renounce the 
possibility of resolving many problems posed long ago, 
and which have simple statements. It is to solve these 
problems, and not for love of complications, that I 
have introduced in this book a definition of the inte- 
gral more general than that of Riemann. 

H. Lebesgue, 1903 


1 The Lebesgue integral: basic properties and conver- 
gence theorems 

The general notion of the Lebesgue integral on $\mathbb{R}^d$ will be defined in a
step-by-step fashion, proceeding successively to increasingly larger fam- 
ilies of functions. At each stage we shall see that the integral satisfies 
elementary properties such as linearity and monotonicity, and we prove 
appropriate convergence theorems that amount to interchanging the in- 
tegral with limits. At the end of the process we shall have achieved a 
general theory of integration that will be decisive in the study of further 
problems. 

We proceed in four stages, by progressively integrating: 

1. Simple functions 

2. Bounded functions supported on a set of finite measure 

3. Non-negative functions

4. Integrable functions (the general case).

We emphasize from the onset that all functions are assumed to be measurable.
At the beginning we also consider only finite-valued functions
which take on real values. Later we shall also consider extended-valued
functions, and also complex-valued functions.


Stage one: simple functions 

Recall from the previous chapter that a simple function $\varphi$ is a finite sum

(1) $$ \varphi(x) = \sum_{k=1}^{N} a_k \chi_{E_k}(x), $$


where the $E_k$ are measurable sets of finite measure and the $a_k$ are constants.
A complication that arises from this definition is that a simple
function can be written in a multitude of ways as such finite linear
combinations; for example, $0 = \chi_E - \chi_E$ for any measurable set $E$ of finite
measure. Fortunately, there is an unambiguous choice for the representation
of a simple function, which is natural and useful in applications.

The canonical form of $\varphi$ is the unique decomposition as in (1), where
the numbers $a_k$ are distinct and non-zero, and the sets $E_k$ are disjoint.

Finding the canonical form of $\varphi$ is straightforward: since $\varphi$ can take
only finitely many distinct and non-zero values, say $c_1, \ldots, c_M$, we may
set $F_k = \{x : \varphi(x) = c_k\}$, and note that the sets $F_k$ are disjoint.
Therefore $\varphi = \sum_{k=1}^{M} c_k \chi_{F_k}$ is the desired canonical form of $\varphi$.

If $\varphi$ is a simple function with canonical form $\varphi(x) = \sum_{k=1}^{M} c_k \chi_{F_k}(x)$,
then we define the Lebesgue integral of $\varphi$ by

$$ \int_{\mathbb{R}^d} \varphi(x)\, dx = \sum_{k=1}^{M} c_k\, m(F_k). $$


If $E$ is a measurable subset of $\mathbb{R}^d$ with finite measure, then $\varphi(x)\chi_E(x)$
is also a simple function, and we define

$$ \int_E \varphi(x)\, dx = \int \varphi(x)\chi_E(x)\, dx. $$


To emphasize the choice of the Lebesgue measure $m$ in the definition of
the integral, one sometimes writes

$$ \int_{\mathbb{R}^d} \varphi(x)\, dm(x) $$

for the Lebesgue integral of $\varphi$. In fact, as a matter of convenience, we
shall often write $\int \varphi(x)\, dx$ or simply $\int \varphi$ for the integral of $\varphi$ over $\mathbb{R}^d$.

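Proposition 1.1 The integral of simple functions defined above satisfies
the following properties:

(i) Independence of the representation. If $\varphi = \sum_{k=1}^{N} a_k \chi_{E_k}$ is any
representation of $\varphi$, then

$$ \int \varphi = \sum_{k=1}^{N} a_k\, m(E_k). $$

(ii) Linearity. If $\varphi$ and $\psi$ are simple, and $a, b \in \mathbb{R}$, then

$$ \int (a\varphi + b\psi) = a \int \varphi + b \int \psi. $$

(iii) Additivity. If $E$ and $F$ are disjoint subsets of $\mathbb{R}^d$ with finite measure, then

$$ \int_{E \cup F} \varphi = \int_E \varphi + \int_F \varphi. $$

(iv) Monotonicity. If $\varphi \le \psi$ are simple, then

$$ \int \varphi \le \int \psi. $$

(v) Triangle inequality. If $\varphi$ is a simple function, then so is $|\varphi|$, and

$$ \left| \int \varphi \right| \le \int |\varphi|. $$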


Proof. The only conclusion that is a little tricky is the first, which 
asserts that the integral of a simple function can be calculated by us- 
ing any of its decompositions as a linear combination of characteristic 
functions. 

Suppose that $\varphi = \sum_{k=1}^{N} a_k \chi_{E_k}$, where we assume that the sets $E_k$ are
disjoint, but we do not suppose that the numbers $a_k$ are distinct and non-zero.
For each distinct non-zero value $a$ among the $\{a_k\}$ we define $E'_a = \bigcup E_k$,
where the union is taken over those indices $k$ such that $a_k = a$.
Note then that the sets $E'_a$ are disjoint, and $m(E'_a) = \sum m(E_k)$, where
the sum is taken over the same set of $k$'s. Then clearly $\varphi = \sum a \chi_{E'_a}$,
where the sum is over the distinct non-zero values of $\{a_k\}$. Thus

$$ \int \varphi = \sum a\, m(E'_a) = \sum_{k=1}^{N} a_k\, m(E_k). $$

Next, suppose $\varphi = \sum_{k=1}^{N} a_k \chi_{E_k}$, where we no longer assume that the $E_k$
are disjoint. Then we can "refine" the decomposition $\bigcup_{k=1}^{N} E_k$ by finding
sets $E_1^*, E_2^*, \ldots, E_n^*$ with the property that $\bigcup_{k=1}^{N} E_k = \bigcup_{j=1}^{n} E_j^*$; the
sets $E_j^*$ ($j = 1, \ldots, n$) are mutually disjoint; and for each $k$, $E_k = \bigcup E_j^*$,
where the union is taken over those $E_j^*$ that are contained in $E_k$. (A proof
of this elementary fact can be found in Exercise 1.) For each $j$, let now
$a_j^* = \sum a_k$, with the summation taken over all $k$ such that $E_k$ contains
$E_j^*$. Then clearly $\varphi = \sum_{j=1}^{n} a_j^* \chi_{E_j^*}$. However, this is a decomposition
already dealt with above because the $E_j^*$ are disjoint. Thus

$$ \int \varphi = \sum_{j} a_j^*\, m(E_j^*) = \sum_{j} \sum_{E_k \supset E_j^*} a_k\, m(E_j^*) = \sum_{k} a_k\, m(E_k), $$

and conclusion (i) is established.
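
The refinement argument can be checked on a small example. The sketch below is
an illustration only and not part of the proof: it assumes that every set $E_k$
is a half-open interval $[l, r)$ on the real line, so that $m(E_k) = r - l$, and
the function names are hypothetical. It computes $\sum a_k m(E_k)$ directly from
an overlapping representation, and then again from the canonical form obtained
by cutting the line at all endpoints.

```python
from fractions import Fraction

def integral_from_representation(terms):
    """Sum a_k * m(E_k) for terms (a_k, left, right), where E_k = [left, right)."""
    return sum(a * (right - left) for a, left, right in terms)

def canonical_form(terms):
    """Map each distinct non-zero value c of phi to the measure of {phi = c}."""
    # The endpoints cut the line into elementary intervals on which phi is constant;
    # these play the role of the refined disjoint sets E_j^*.
    points = sorted({p for _, left, right in terms for p in (left, right)})
    measures = {}
    for lo, hi in zip(points, points[1:]):
        # a_j^* is the sum of a_k over all E_k that contain the elementary interval.
        value = sum(a for a, left, right in terms if left <= lo and hi <= right)
        if value != 0:
            measures[value] = measures.get(value, Fraction(0)) + (hi - lo)
    return measures

def integral_from_canonical(terms):
    """Sum c * m(F_c) over the canonical form, as in the definition of the integral."""
    return sum(c * measure for c, measure in canonical_form(terms).items())

# phi = 2*chi_[0,2) + 3*chi_[1,3) - 5*chi_[1,2): the sets overlap on [1,2).
terms = [
    (Fraction(2), Fraction(0), Fraction(2)),
    (Fraction(3), Fraction(1), Fraction(3)),
    (Fraction(-5), Fraction(1), Fraction(2)),
]
print(canonical_form(terms))
print(integral_from_representation(terms), integral_from_canonical(terms))
```

In this example $\varphi$ equals 2 on $[0, 1)$, vanishes on $[1, 2)$, and equals 3
on $[2, 3)$, so both computations give the same value, as conclusion (i) asserts.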

Conclusion (ii) follows by using any representation of $\varphi$ and $\psi$, and
the obvious linearity of (i).

For the additivity over sets, one must note that if $E$ and $F$ are disjoint,
then

$$ \chi_{E \cup F} = \chi_E + \chi_F, $$

and we may use the linearity of the integral to see that
$\int_{E \cup F} \varphi = \int_E \varphi + \int_F \varphi$.

If $\eta \ge 0$ is a simple function, then its canonical form is everywhere
non-negative, and therefore $\int \eta \ge 0$ by the definition of the integral. Applying
this argument to $\psi - \varphi$ gives the desired monotonicity property.

Finally, for the triangle inequality, it suffices to write $\varphi$ in its canonical
form $\varphi = \sum_{k=1}^{N} a_k \chi_{E_k}$ and observe that

$$ |\varphi| = \sum_{k=1}^{N} |a_k| \chi_{E_k}(x). $$

Therefore, by the triangle inequality applied to the definition of the
integral, one sees that

$$ \left| \int \varphi \right| = \left| \sum_{k=1}^{N} a_k\, m(E_k) \right| \le \sum_{k=1}^{N} |a_k|\, m(E_k) = \int |\varphi|. $$

Incidentally, it is worthwhile to point out the following easy fact: whenever
$f$ and $g$ are a pair of simple functions that agree almost everywhere,
then $\int f = \int g$. The identity of the integrals of two functions that agree
almost everywhere will continue to hold for the successive definitions of
the integral that follow.


Stage two: bounded functions supported on a set of finite 
measure 

The support of a measurable function $f$ is defined to be the set of all
points where $f$ does not vanish,

$$ \mathrm{supp}(f) = \{x : f(x) \neq 0\}. $$

We shall also say that $f$ is supported on a set $E$, if $f(x) = 0$ whenever
$x \notin E$.

Since $f$ is measurable, so is the set $\mathrm{supp}(f)$. We shall next be interested
in those bounded measurable functions that have $m(\mathrm{supp}(f)) < \infty$.

An important result in the previous chapter (Theorem 4.2) states the
following: if $f$ is a function bounded by $M$ and supported on a set $E$, then
there exists a sequence $\{\varphi_n\}$ of simple functions, with each $\varphi_n$ bounded
by $M$ and supported on $E$, and such that

$$ \varphi_n(x) \to f(x) \quad \text{for all } x. $$


The key lemma that follows allows us to define the integral for the class 
of bounded functions supported on sets of finite measure. 


Lemma 1.2 Let $f$ be a bounded function supported on a set $E$ of finite
measure. If $\{\varphi_n\}_{n=1}^{\infty}$ is any sequence of simple functions bounded by $M$,
supported on $E$, and with $\varphi_n(x) \to f(x)$ for a.e. $x$, then:

(i) The limit $\lim_{n \to \infty} \int \varphi_n$ exists.

(ii) If $f = 0$ a.e., then the limit $\lim_{n \to \infty} \int \varphi_n$ equals 0.


Proof. The assertions of the lemma would be nearly obvious if we
had that $\varphi_n$ converges to $f$ uniformly on $E$. Instead, we recall one of
Littlewood's principles, which states that the convergence of a sequence
of measurable functions is "nearly" uniform. The precise statement lying
behind this principle is Egorov's theorem, which we proved in Chapter 1,
and which we apply here.

Since the measure of $E$ is finite, given $\epsilon > 0$ Egorov's theorem
guarantees the existence of a (closed) measurable subset $A_\epsilon$ of $E$ such that
$m(E - A_\epsilon) \le \epsilon$, and $\varphi_n \to f$ uniformly on $A_\epsilon$. Therefore, setting
$I_n = \int \varphi_n$ we have that

$$ |I_n - I_m| \le \int_E |\varphi_n(x) - \varphi_m(x)|\, dx $$
$$ = \int_{A_\epsilon} |\varphi_n(x) - \varphi_m(x)|\, dx + \int_{E - A_\epsilon} |\varphi_n(x) - \varphi_m(x)|\, dx $$
$$ \le \int_{A_\epsilon} |\varphi_n(x) - \varphi_m(x)|\, dx + 2M\, m(E - A_\epsilon) $$
$$ \le \int_{A_\epsilon} |\varphi_n(x) - \varphi_m(x)|\, dx + 2M\epsilon. $$

By the uniform convergence, one has, for all $x \in A_\epsilon$ and all large $n$ and
$m$, the estimate $|\varphi_n(x) - \varphi_m(x)| < \epsilon$, so we deduce that

$$ |I_n - I_m| \le m(E)\epsilon + 2M\epsilon \quad \text{for all large } n \text{ and } m. $$

Since $\epsilon$ is arbitrary and $m(E) < \infty$, this proves that $\{I_n\}$ is a Cauchy
sequence and hence converges, as desired.

For the second part, we note that if $f = 0$, we may repeat the argument
above to find that $|I_n| \le m(E)\epsilon + M\epsilon$, which yields $\lim_{n \to \infty} I_n = 0$, as
was to be shown.


Using Lemma 1.2 we can now turn to the integration of bounded functions
that are supported on sets of finite measure. For such a function $f$
we define its Lebesgue integral by

$$ \int f(x)\, dx = \lim_{n \to \infty} \int \varphi_n(x)\, dx, $$

where $\{\varphi_n\}$ is any sequence of simple functions satisfying: $|\varphi_n| \le M$,
each $\varphi_n$ is supported on the support of $f$, and $\varphi_n(x) \to f(x)$ for a.e. $x$
as $n$ tends to infinity. By the previous lemma, we know that this limit
exists.

Next, we must show that $\int f$ is independent of the limiting sequence
$\{\varphi_n\}$ used, in order for the integral to be well-defined. Therefore,
suppose that $\{\psi_n\}$ is another sequence of simple functions that is
bounded by $M$, supported on $\mathrm{supp}(f)$, and such that $\psi_n(x) \to f(x)$ for
a.e. $x$ as $n$ tends to infinity. Then, if $\eta_n = \varphi_n - \psi_n$, the sequence $\{\eta_n\}$
consists of simple functions bounded by $2M$, supported on a set of finite
measure, and such that $\eta_n \to 0$ a.e. as $n$ tends to infinity. We may
therefore conclude, by the second part of the lemma, that $\int \eta_n \to 0$ as $n$
tends to infinity. Consequently, the two limits

$$ \lim_{n \to \infty} \int \varphi_n(x)\, dx \quad \text{and} \quad \lim_{n \to \infty} \int \psi_n(x)\, dx $$

(which exist by the lemma) are indeed equal.


If $E$ is a subset of $\mathbb{R}^d$ with finite measure, and $f$ is bounded with
$m(\mathrm{supp}(f)) < \infty$, then it is natural to define

$$ \int_E f(x)\, dx = \int f(x)\chi_E(x)\, dx. $$

Clearly, if $f$ is itself simple, then $\int f$ as defined above coincides with
the integral of simple functions studied earlier. This extension of the
definition of integration also satisfies all the basic properties of the integral
of simple functions.

Proposition 1.3 Suppose $f$ and $g$ are bounded functions supported on
sets of finite measure. Then the following properties hold.

(i) Linearity. If $a, b \in \mathbb{R}$, then

$$ \int (af + bg) = a \int f + b \int g. $$

(ii) Additivity. If $E$ and $F$ are disjoint subsets of $\mathbb{R}^d$, then

$$ \int_{E \cup F} f = \int_E f + \int_F f. $$

(iii) Monotonicity. If $f \le g$, then

$$ \int f \le \int g. $$

(iv) Triangle inequality. $|f|$ is also bounded, supported on a set of finite
measure, and

$$ \left| \int f \right| \le \int |f|. $$

All these properties follow by using approximations by simple functions,
and the properties of the integral of simple functions given in
Proposition 1.1.

We are now in a position to prove the first important convergence 
theorem. 

Theorem 1.4 (Bounded convergence theorem) Suppose that $\{f_n\}$
is a sequence of measurable functions that are all bounded by $M$, are
supported on a set $E$ of finite measure, and $f_n(x) \to f(x)$ a.e. $x$ as
$n \to \infty$. Then $f$ is measurable, bounded, supported on $E$ for a.e. $x$, and

$$ \int |f_n - f| \to 0 \quad \text{as } n \to \infty. $$

Consequently,

$$ \int f_n \to \int f \quad \text{as } n \to \infty. $$

Proof. From the assumptions one sees at once that $f$ is bounded by $M$
almost everywhere and vanishes outside $E$, except possibly on a set of
measure zero. Clearly, the triangle inequality for the integral implies
that it suffices to prove that $\int |f_n - f| \to 0$ as $n$ tends to infinity.

The proof is a reprise of the argument in Lemma 1.2. Given $\epsilon > 0$, we
may find, by Egorov's theorem, a measurable subset $A_\epsilon$ of $E$ such that
$m(E - A_\epsilon) \le \epsilon$ and $f_n \to f$ uniformly on $A_\epsilon$. Then, we know that for
all sufficiently large $n$ we have $|f_n(x) - f(x)| \le \epsilon$ for all $x \in A_\epsilon$. Putting
these facts together yields

$$ \int |f_n(x) - f(x)|\, dx \le \int_{A_\epsilon} |f_n(x) - f(x)|\, dx + \int_{E - A_\epsilon} |f_n(x) - f(x)|\, dx $$
$$ \le \epsilon\, m(E) + 2M\, m(E - A_\epsilon) $$

for all large $n$. Since $\epsilon$ is arbitrary, the proof of the theorem is complete.
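
To see the two-piece estimate of the proof in a concrete case, consider
$f_n(x) = x^n$ on $E = [0, 1]$; this example is ours and is not taken from the text.
Here $M = 1$, $f_n \to 0$ almost everywhere, and the convergence is uniform on
$A_\epsilon = [0, 1 - \epsilon]$, so the set supplied by Egorov's theorem can be written
down explicitly. Since the limit is $f = 0$, the exceptional set contributes at
most $M\, m(E - A_\epsilon)$ rather than $2M\, m(E - A_\epsilon)$. The function names below
are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def integral_xn(n):
    """Exact value of the integral of x**n over [0, 1]."""
    return Fraction(1, n + 1)

def sup_on_good_set(n, eps):
    """Supremum of x**n over A_eps = [0, 1 - eps], where convergence is uniform."""
    return (1 - eps) ** n

def split_bound(n, eps, M=1):
    """sup over A_eps times m(A_eps), plus M * m(E - A_eps): the two pieces of the proof."""
    return sup_on_good_set(n, eps) * (1 - eps) + M * eps

eps = Fraction(1, 10)
for n in (1, 10, 100):
    # The exact integral never exceeds the two-piece bound.
    print(n, float(integral_xn(n)), float(split_bound(n, eps)))
```

For fixed $\epsilon$ the first piece tends to 0 as $n$ grows, while the second piece
stays at $M\epsilon$; letting $\epsilon \to 0$ afterwards is what forces $\int |f_n - f| \to 0$.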


We note that the above convergence theorem is a statement about the
interchange of an integral and a limit, since its conclusion simply says

$$ \lim_{n \to \infty} \int f_n = \int \lim_{n \to \infty} f_n. $$

A useful observation that we can make at this point is the following: if
$f \ge 0$ is bounded and supported on a set of finite measure $E$ and $\int f = 0$,
then $f = 0$ almost everywhere. Indeed, if for each integer $k \ge 1$ we set
$E_k = \{x \in E : f(x) \ge 1/k\}$, then the fact that $k^{-1}\chi_{E_k}(x) \le f(x)$ implies

$$ k^{-1} m(E_k) \le \int f, $$

by monotonicity of the integral. Thus $m(E_k) = 0$ for all $k$, and since
$\{x : f(x) > 0\} = \bigcup_{k=1}^{\infty} E_k$, we see that $f = 0$ almost everywhere.


Return to Riemann integrable functions 


We shall now show that Riemann integrable functions are also Lebesgue 
integrable. When we combine this with the bounded convergence theo- 
rem we have just proved, we see that Lebesgue integration resolves the 
second problem in the Introduction. 


Theorem 1.5 Suppose $f$ is Riemann integrable on the closed interval
$[a, b]$. Then $f$ is measurable, and

$$ \int_{[a,b]}^{\mathcal{R}} f(x)\, dx = \int_{[a,b]}^{\mathcal{L}} f(x)\, dx, $$

where the integral on the left-hand side is the standard Riemann integral,
and that on the right-hand side is the Lebesgue integral.


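Proof. By definition, a Riemann integrable function is bounded, say
$|f(x)| \le M$, so we need to prove that $f$ is measurable, and then establish
the equality of integrals.

Again, by definition of Riemann integrability,[1] we may construct two
sequences of step functions $\{\varphi_k\}$ and $\{\psi_k\}$ that satisfy the following
properties: $|\varphi_k(x)| \le M$ and $|\psi_k(x)| \le M$ for all $x \in [a, b]$ and $k \ge 1$,

$$ \varphi_1(x) \le \varphi_2(x) \le \cdots \le f \le \cdots \le \psi_2(x) \le \psi_1(x), $$

and

(2) $$ \lim_{k \to \infty} \int_{[a,b]}^{\mathcal{R}} \varphi_k(x)\, dx = \lim_{k \to \infty} \int_{[a,b]}^{\mathcal{R}} \psi_k(x)\, dx = \int_{[a,b]}^{\mathcal{R}} f(x)\, dx. $$

Several observations are in order. First, it follows immediately from their
definition that for step functions the Riemann and Lebesgue integrals
agree; therefore

(3) $$ \int_{[a,b]}^{\mathcal{R}} \varphi_k(x)\, dx = \int_{[a,b]}^{\mathcal{L}} \varphi_k(x)\, dx \quad \text{and} \quad \int_{[a,b]}^{\mathcal{R}} \psi_k(x)\, dx = \int_{[a,b]}^{\mathcal{L}} \psi_k(x)\, dx $$

for all $k \ge 1$.

[1] See also Section 1 of the Appendix in Book I.

Next, if we let

$$ \tilde{\varphi}(x) = \lim_{k \to \infty} \varphi_k(x) \quad \text{and} \quad \tilde{\psi}(x) = \lim_{k \to \infty} \psi_k(x), $$

we have $\tilde{\varphi} \le f \le \tilde{\psi}$. Moreover, both $\tilde{\varphi}$ and $\tilde{\psi}$ are measurable (being the
limit of step functions), and the bounded convergence theorem yields

$$ \lim_{k \to \infty} \int_{[a,b]}^{\mathcal{L}} \varphi_k(x)\, dx = \int_{[a,b]}^{\mathcal{L}} \tilde{\varphi}(x)\, dx $$

and

$$ \lim_{k \to \infty} \int_{[a,b]}^{\mathcal{L}} \psi_k(x)\, dx = \int_{[a,b]}^{\mathcal{L}} \tilde{\psi}(x)\, dx. $$

This together with (2) and (3) yields

$$ \int_{[a,b]}^{\mathcal{L}} (\tilde{\psi}(x) - \tilde{\varphi}(x))\, dx = 0, $$

and since $\psi_k - \varphi_k \ge 0$, we must have $\tilde{\psi} - \tilde{\varphi} \ge 0$. By the observation
following the proof of the bounded convergence theorem, we conclude
that $\tilde{\psi} - \tilde{\varphi} = 0$ a.e., and therefore $\tilde{\varphi} = \tilde{\psi} = f$ a.e., which proves that $f$
is measurable. Finally, since $\varphi_k \to f$ almost everywhere, we have (by
definition)

$$ \lim_{k \to \infty} \int_{[a,b]}^{\mathcal{L}} \varphi_k(x)\, dx = \int_{[a,b]}^{\mathcal{L}} f(x)\, dx, $$

and by (2) and (3) we see that $\int_{[a,b]}^{\mathcal{R}} f(x)\, dx = \int_{[a,b]}^{\mathcal{L}} f(x)\, dx$, as desired.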


Stage three: non-negative functions 

We proceed with the integrals of functions that are measurable and
non-negative but not necessarily bounded. It will be important to allow
these functions to be extended-valued, that is, these functions may take
on the value $+\infty$ (on a measurable set). We recall in this connection the
convention that one defines the supremum of a set of positive numbers
to be $+\infty$ if the set is unbounded.

In the case of such a function $f$ we define its (extended) Lebesgue
integral by

$$ \int f(x)\, dx = \sup_g \int g(x)\, dx, $$

where the supremum is taken over all measurable functions $g$ such that
$0 \le g \le f$, and where $g$ is bounded and supported on a set of finite
measure.

With the above definition of the integral, there are only two possible
cases; the supremum is either finite, or infinite. In the first case, when
$\int f(x)\, dx < \infty$, we shall say that $f$ is Lebesgue integrable or simply
integrable.

Clearly, if $E$ is any measurable subset of $\mathbb{R}^d$, and $f \ge 0$, then $f\chi_E$ is
also positive, and we define

$$ \int_E f(x)\, dx = \int f(x)\chi_E(x)\, dx. $$

Simple examples of functions on $\mathbb{R}^d$ that are integrable (or non-integrable)
are given by

$$ f_a(x) = \begin{cases} |x|^{-a} & \text{if } |x| \le 1, \\ 0 & \text{if } |x| > 1, \end{cases} \qquad F_a(x) = \frac{1}{1 + |x|^a}, \quad \text{all } x \in \mathbb{R}^d. $$

Then $f_a$ is integrable exactly when $a < d$, while $F_a$ is integrable exactly
when $a > d$. See the discussion following Corollary 1.10 and also
Exercise 10.
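
The two thresholds can be made plausible in dimension $d = 1$ with elementary
antiderivatives. The sketch below is a hedged illustration with hypothetical
function names: it evaluates the integral of $|x|^{-a}$ over $\epsilon \le |x| \le 1$ and over
$1 \le |x| \le R$ in closed form. Passing from these truncated integrals to the
Lebesgue integrals of $f_a$ and $F_a$ relies on the convergence theorems of this
section, and for $F_a$ on the comparison
$\tfrac12 |x|^{-a} \le F_a(x) \le |x|^{-a}$, valid for $|x| \ge 1$ and $a > 0$.

```python
import math

def truncated_integral_near_zero(a, eps):
    """Integral of |x|**(-a) over eps <= |x| <= 1 on the real line (d = 1)."""
    if a == 1:
        return 2 * math.log(1 / eps)
    return 2 * (1 - eps ** (1 - a)) / (1 - a)

def truncated_integral_at_infinity(a, R):
    """Integral of |x|**(-a) over 1 <= |x| <= R on the real line (d = 1)."""
    if a == 1:
        return 2 * math.log(R)
    return 2 * (R ** (1 - a) - 1) / (1 - a)

for a in (0.5, 1.0, 1.5):
    # Shrink eps and grow R to see which truncated integrals stay bounded.
    near_zero = [truncated_integral_near_zero(a, 10.0 ** (-j)) for j in (2, 6, 12)]
    at_infinity = [truncated_integral_at_infinity(a, 10.0 ** j) for j in (2, 6, 12)]
    print(a, near_zero, at_infinity)
```

With $d = 1$, the values near the origin remain bounded only for $a < 1$, and the
values at infinity remain bounded only for $a > 1$, in agreement with the
statement above.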


Proposition 1.6 The integral of non-negative measurable functions
enjoys the following properties:

(i) Linearity. If $f, g \ge 0$, and $a, b$ are positive real numbers, then

$$ \int (af + bg) = a \int f + b \int g. $$

(ii) Additivity. If $E$ and $F$ are disjoint subsets of $\mathbb{R}^d$, and $f \ge 0$, then

$$ \int_{E \cup F} f = \int_E f + \int_F f. $$

(iii) Monotonicity. If $0 \le f \le g$, then

$$ \int f \le \int g. $$

(iv) If $g$ is integrable and $0 \le f \le g$, then $f$ is integrable.

(v) If $f$ is integrable, then $f(x) < \infty$ for almost every $x$.

(vi) If $\int f = 0$, then $f(x) = 0$ for almost every $x$.

Proof. Of the first four assertions, only (i) is not an immediate
consequence of the definitions, and to prove it we argue as follows. We
take $a = b = 1$ and note that if $\varphi \le f$ and $\psi \le g$, where both $\varphi$ and $\psi$ are
bounded and supported on sets of finite measure, then $\varphi + \psi \le f + g$,
and $\varphi + \psi$ is also bounded and supported on a set of finite measure.
Consequently

$$ \int f + \int g \le \int (f + g). $$

To prove the reverse inequality, suppose $\eta$ is bounded and supported on a
set of finite measure, and $\eta \le f + g$. If we define $\eta_1(x) = \min(f(x), \eta(x))$
and $\eta_2 = \eta - \eta_1$, we note that

$$ \eta_1 \le f \quad \text{and} \quad \eta_2 \le g. $$

Moreover both $\eta_1, \eta_2$ are bounded and supported on sets of finite
measure. Hence

$$ \int \eta = \int (\eta_1 + \eta_2) = \int \eta_1 + \int \eta_2 \le \int f + \int g. $$

Taking the supremum over $\eta$ yields the required inequality.

To prove the conclusion (v) we argue as follows. Suppose
$E_k = \{x : f(x) \ge k\}$, and $E_\infty = \{x : f(x) = \infty\}$. Then

$$ \int f \ge \int \chi_{E_k} f \ge k\, m(E_k), $$

hence $m(E_k) \to 0$ as $k \to \infty$. Since $E_k \searrow E_\infty$, Corollary 3.3 in the
previous chapter implies that $m(E_\infty) = 0$.

The proof of (vi) is the same as the observation following Theorem 1.4.


We now turn our attention to some important convergence theorems
for the class of non-negative measurable functions. To motivate the
results that follow, we ask the following question: Suppose $f_n \ge 0$ and
$f_n(x) \to f(x)$ for almost every $x$. Is it true that $\int f_n\, dx \to \int f\, dx$?
Unfortunately, the example that follows provides a negative answer to this,
and shows that we must change our formulation of the question to obtain
a positive convergence result.

Let

$$ f_n(x) = \begin{cases} n & \text{if } 0 < x < 1/n, \\ 0 & \text{otherwise.} \end{cases} $$

Then $f_n(x) \to 0$ for all $x$, yet $\int f_n(x)\, dx = 1$ for all $n$. In this particular
example, the limit of the integrals is greater than the integral of the limit
function. This turns out to be the case in general, as we shall see now.
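
The example can be checked mechanically. The following sketch, with hypothetical
function names, encodes $f_n$ as a step function on the real line and uses exact
rational arithmetic; the integral is the height $n$ times the length $1/n$ of the
interval on which $f_n$ is non-zero.

```python
from fractions import Fraction

def f(n, x):
    """f_n(x) = n if 0 < x < 1/n, and 0 otherwise."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Integral of f_n: the constant value n times the measure 1/n of (0, 1/n)."""
    return n * Fraction(1, n)

x = Fraction(1, 7)
# For a fixed x > 0 the values f_n(x) are eventually 0, while every integral equals 1.
print([f(n, x) for n in range(1, 12)])
print([integral_f(n) for n in range(1, 12)])
```

For the fixed point $x = 1/7$ the values $f_n(x)$ vanish as soon as $1/n \le x$, so the
pointwise limit is 0 even though the integrals do not tend to 0.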

Lemma 1.7 (Fatou) Suppose $\{f_n\}$ is a sequence of measurable functions
with $f_n \ge 0$. If $\lim_{n \to \infty} f_n(x) = f(x)$ for a.e. $x$, then

$$ \int f \le \liminf_{n \to \infty} \int f_n. $$

Proof. Suppose $0 \le g \le f$, where $g$ is bounded and supported on a
set $E$ of finite measure. If we set $g_n(x) = \min(g(x), f_n(x))$, then $g_n$ is
measurable, supported on $E$, and $g_n(x) \to g(x)$ a.e., so by the bounded
convergence theorem

$$ \int g_n \to \int g. $$

By construction, we also have $g_n \le f_n$, so that $\int g_n \le \int f_n$, and therefore

$$ \int g \le \liminf_{n \to \infty} \int f_n. $$

Taking the supremum over all $g$ yields the desired inequality.

In particular, we do not exclude the cases $\int f = \infty$, or $\liminf_{n \to \infty} f_n = \infty$.


We can now immediately deduce the following series of corollaries. 

Corollary 1.8 Suppose $f$ is a non-negative measurable function, and
$\{f_n\}$ a sequence of non-negative measurable functions with $f_n(x) \le f(x)$
and $f_n(x) \to f(x)$ for almost every $x$. Then

$$ \lim_{n \to \infty} \int f_n = \int f. $$

Proof. Since $f_n(x) \le f(x)$ a.e. $x$, we necessarily have $\int f_n \le \int f$ for
all $n$; hence

$$ \limsup_{n \to \infty} \int f_n \le \int f. $$

This inequality combined with Fatou's lemma proves the desired limit.

In particular, we can now obtain a basic convergence theorem for the 
class of non-negative measurable functions. Its statement requires the 
following notation. 

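In analogy with the symbols $\nearrow$ and $\searrow$ used to describe increasing and
decreasing sequences of sets, we shall write

$$ f_n \nearrow f $$

whenever $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions that satisfies

$$ f_n(x) \le f_{n+1}(x) \text{ a.e. } x, \text{ all } n \ge 1 \quad \text{and} \quad \lim_{n \to \infty} f_n(x) = f(x) \text{ a.e. } x. $$

Similarly, we write $f_n \searrow f$ whenever

$$ f_n(x) \ge f_{n+1}(x) \text{ a.e. } x, \text{ all } n \ge 1 \quad \text{and} \quad \lim_{n \to \infty} f_n(x) = f(x) \text{ a.e. } x. $$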


Corollary 1.9 (Monotone convergence theorem) Suppose $\{f_n\}$ is
a sequence of non-negative measurable functions with $f_n \nearrow f$. Then

$$ \lim_{n \to \infty} \int f_n = \int f. $$

The monotone convergence theorem has the following useful conse- 
quence: 

Corollary 1.10 Consider a series $\sum_{k=1}^{\infty} a_k(x)$, where $a_k(x) \ge 0$ is
measurable for every $k \ge 1$. Then

$$ \int \sum_{k=1}^{\infty} a_k(x)\, dx = \sum_{k=1}^{\infty} \int a_k(x)\, dx. $$

If $\sum_{k=1}^{\infty} \int a_k(x)\, dx$ is finite, then the series $\sum_{k=1}^{\infty} a_k(x)$ converges for
a.e. $x$.

Proof. Let $f_n(x) = \sum_{k=1}^{n} a_k(x)$ and $f(x) = \sum_{k=1}^{\infty} a_k(x)$. The functions
$f_n$ are measurable, $f_n(x) \le f_{n+1}(x)$, and $f_n(x) \to f(x)$ as $n$ tends
to infinity. Since

$$ \int f_n = \sum_{k=1}^{n} \int a_k(x)\, dx, $$

the monotone convergence theorem implies

$$ \sum_{k=1}^{\infty} \int a_k(x)\, dx = \int \sum_{k=1}^{\infty} a_k(x)\, dx. $$

If $\sum \int a_k < \infty$, then the above implies that $\sum_{k=1}^{\infty} a_k(x)$ is integrable,
and by our earlier observation, we conclude that $\sum_{k=1}^{\infty} a_k(x)$ is finite
almost everywhere.

We give two nice illustrations of this last corollary.

The first consists of another proof of the Borel-Cantelli lemma (see
Exercise 16, Chapter 1), which says that if $E_1, E_2, \ldots$ is a collection
of measurable subsets with $\sum m(E_k) < \infty$, then the set of points that
belong to infinitely many sets $E_k$ has measure zero. To prove this fact,
we let

$$ a_k(x) = \chi_{E_k}(x), $$

and note that a point $x$ belongs to infinitely many sets $E_k$ if and only
if $\sum_{k=1}^{\infty} a_k(x) = \infty$. Our assumption on $\sum m(E_k)$ says precisely that
$\sum_{k=1}^{\infty} \int a_k(x)\, dx < \infty$, and the corollary implies that $\sum_{k=1}^{\infty} a_k(x)$ is finite
except possibly on a set of measure zero, and thus the Borel-Cantelli
lemma is proved.
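
A small deterministic instance may help to fix ideas. The sets
$E_k = [0, 1/k^2)$ used below are a hypothetical choice made only for this
sketch, as are the function names; they satisfy $\sum m(E_k) < \infty$, and the only
point lying in infinitely many of them is $x = 0$. The code evaluates the partial
sums $\sum_{k \le K} \chi_{E_k}(x)$ that appear in the argument above.

```python
from fractions import Fraction

def membership_count(x, K):
    """Partial sum of chi_{E_k}(x) over k <= K, for the choice E_k = [0, 1/k**2)."""
    return sum(1 for k in range(1, K + 1) if 0 <= x < Fraction(1, k * k))

def total_measure(K):
    """Partial sum of m(E_k) = 1/k**2 over k <= K."""
    return sum(Fraction(1, k * k) for k in range(1, K + 1))

x = Fraction(1, 100)
# For x > 0 the count stops growing once 1/k**2 <= x; at x = 0 it grows without bound.
print([membership_count(x, K) for K in (5, 10, 100, 1000)])
print([membership_count(0, K) for K in (5, 10, 100, 1000)])
print(float(total_measure(1000)))
```

Here the exceptional set $\{0\}$ has measure zero, which is consistent with the
lemma.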

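The second illustration will be useful in our discussion of approximations
to the identity in Chapter 3. Consider the function

$$ f(x) = \begin{cases} \dfrac{1}{|x|^{d+1}} & \text{if } x \neq 0, \\ 0 & \text{otherwise.} \end{cases} $$

We prove that $f$ is integrable outside any ball, $|x| \ge \epsilon$, and moreover

$$ \int_{|x| \ge \epsilon} f(x)\, dx \le \frac{C}{\epsilon}, \quad \text{for some constant } C > 0. $$

Indeed, if we let $A_k = \{x \in \mathbb{R}^d : 2^k \epsilon < |x| \le 2^{k+1} \epsilon\}$, and define

$$ g(x) = \sum_{k=0}^{\infty} a_k(x) \quad \text{where} \quad a_k(x) = \frac{1}{(2^k \epsilon)^{d+1}} \chi_{A_k}(x), $$

then we must have $f(x) \le g(x)$, and hence $\int f \le \int g$. Since the set $A_k$
is obtained from $A = \{1 < |x| \le 2\}$ by a dilation of factor $2^k \epsilon$, we have
by the relative dilation-invariance properties of the Lebesgue measure,
that $m(A_k) = (2^k \epsilon)^d m(A)$. Also by Corollary 1.10, we see that

$$ \int g = \sum_{k=0}^{\infty} \frac{m(A_k)}{(2^k \epsilon)^{d+1}} = m(A) \sum_{k=0}^{\infty} \frac{(2^k \epsilon)^d}{(2^k \epsilon)^{d+1}} = \frac{C}{\epsilon}, $$

where $C = 2\, m(A)$. Note that the same dilation-invariance property in
fact shows that

$$ \int_{|x| \ge \epsilon} \frac{dx}{|x|^{d+1}} = \frac{1}{\epsilon} \int_{|x| \ge 1} \frac{dx}{|x|^{d+1}}. $$

See also the identity (7) below.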
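
In dimension $d = 1$ every quantity in this argument is explicit, which gives a
convenient check. The sketch below assumes $d = 1$, so that $f(x) = 1/x^2$,
$m(A) = 2$ and $C = 2\, m(A) = 4$; the function names are hypothetical. It sums the
dyadic shell contributions $\int a_k$ and compares them with the exact tail
integral $2/\epsilon$.

```python
from fractions import Fraction

def shell_measure(k, eps):
    """m(A_k) for A_k = {2**k * eps < |x| <= 2**(k+1) * eps} on the real line."""
    return 2 * (2 ** (k + 1) * eps - 2 ** k * eps)

def shell_term(k, eps):
    """Integral of a_k = (2**k * eps)**(-2) * chi_{A_k}, the case d = 1."""
    return shell_measure(k, eps) / (2 ** k * eps) ** 2

def majorant_partial_sum(K, eps):
    """Sum of the integrals of a_0, ..., a_K; increases to the integral of g."""
    return sum(shell_term(k, eps) for k in range(K + 1))

def exact_tail(eps):
    """Integral of 1/x**2 over |x| >= eps."""
    return 2 / eps

eps = Fraction(1, 10)
# The partial sums increase towards C/eps = 4/eps and dominate the exact tail 2/eps.
print([majorant_partial_sum(K, eps) for K in (0, 1, 5, 20)])
print(exact_tail(eps), 4 / eps)
```

The bound $C/\epsilon$ overestimates the true tail by a factor of 2 in this case, which
is harmless since only the order of magnitude in $\epsilon$ matters.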


Stage four: general case 

If $f$ is any real-valued measurable function on $\mathbb{R}^d$, we say that $f$ is
Lebesgue integrable (or just integrable) if the non-negative measurable
function $|f|$ is integrable in the sense of the previous section.

If $f$ is Lebesgue integrable, we give a meaning to its integral as follows.
First, we may define

$$ f^+(x) = \max(f(x), 0) \quad \text{and} \quad f^-(x) = \max(-f(x), 0), $$

so that both $f^+$ and $f^-$ are non-negative and $f^+ - f^- = f$. Since $f^{\pm} \le |f|$,
both functions $f^+$ and $f^-$ are integrable whenever $f$ is, and we then
define the Lebesgue integral of $f$ by

$$ \int f = \int f^+ - \int f^-. $$

In practice one encounters many decompositions $f = f_1 - f_2$, where
$f_1, f_2$ are both non-negative integrable functions, and one would expect
that regardless of the decomposition of $f$, we always have

$$ \int f = \int f_1 - \int f_2. $$

In other words, the definition of the integral should be independent of the
decomposition $f = f_1 - f_2$. To see why this is so, suppose $f = g_1 - g_2$
is another decomposition where both $g_1$ and $g_2$ are non-negative and
integrable. Since $f_1 - f_2 = g_1 - g_2$ we have $f_1 + g_2 = g_1 + f_2$; but both
sides of this last identity consist of positive measurable functions, so the
linearity of the integral in this case yields

$$ \int f_1 + \int g_2 = \int g_1 + \int f_2. $$

Since all integrals involved are finite, we find the desired result

$$ \int f_1 - \int f_2 = \int g_1 - \int g_2. $$

In considering the above definitions it is useful to keep in mind the
following small observations. Both the integrability of $f$, and the value
of its integral are unchanged if we modify $f$ arbitrarily on a set of measure
zero. It is therefore useful to adopt the convention that in the context
of integration we allow our functions to be undefined on sets of measure
zero. Moreover, if $f$ is integrable, then by (v) of Proposition 1.6, it is
finite-valued almost everywhere. Thus, availing ourselves of the above
convention, we can always add two integrable functions $f$ and $g$, since
the ambiguity of $f + g$, due to the extended values of each, resides in a
set of measure zero. Moreover, we note that when speaking of a function
$f$, we are, in effect, also speaking about the collection of all functions
that equal $f$ almost everywhere.

Simple applications of the definition and the properties proved previously
yield all the elementary properties of the integral:

Proposition 1.11 The integral of Lebesgue integrable functions is linear,
additive, monotonic, and satisfies the triangle inequality.

We now gather two results which, although instructive in their own 
right, are also needed in the proof of the next theorem. 

Proposition 1.12 Suppose $f$ is integrable on $\mathbb{R}^d$. Then for every $\epsilon > 0$:

(i) There exists a set of finite measure $B$ (a ball, for example) such
that

$$ \int_{B^c} |f| < \epsilon. $$

(ii) There is a $\delta > 0$ such that

$$ \int_E |f| < \epsilon \quad \text{whenever } m(E) < \delta. $$

The last condition is known as absolute continuity.

Proof. By replacing $f$ with $|f|$ we may assume without loss of
generality that $f \ge 0$.

For the first part, let $B_N$ denote the ball of radius $N$ centered at the
origin, and note that if $f_N(x) = f(x)\chi_{B_N}(x)$, then $f_N \ge 0$ is
measurable, $f_N(x) \le f_{N+1}(x)$, and $\lim_{N \to \infty} f_N(x) = f(x)$.
By the monotone convergence theorem, we must have

$$\lim_{N \to \infty} \int f_N = \int f.$$

In particular, for some large $N$,

$$0 \le \int f - \int f \chi_{B_N} < \epsilon,$$

and since $1 - \chi_{B_N} = \chi_{B_N^c}$, this implies
$\int_{B_N^c} f < \epsilon$, as we set out to prove.

For the second part, assuming again that $f \ge 0$, we let
$f_N(x) = f(x)\chi_{E_N}(x)$ where

$$E_N = \{x : f(x) \le N\}.$$

Once again, $f_N \ge 0$ is measurable, $f_N(x) \le f_{N+1}(x)$, and given
$\epsilon > 0$ there exists (by the monotone convergence theorem) an integer
$N > 0$ such that

$$\int (f - f_N) < \frac{\epsilon}{2}.$$

We now pick $\delta > 0$ so that $N\delta < \epsilon/2$. If $m(E) < \delta$,
then

$$\begin{aligned}
\int_E f &= \int_E (f - f_N) + \int_E f_N \\
&\le \int (f - f_N) + \int_E f_N \\
&\le \int (f - f_N) + N\,m(E) \\
&< \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.
\end{aligned}$$

This concludes the proof of the proposition.
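
To make part (ii) concrete, here is a small sketch for one assumed example, $f(x) = x^{-1/2}$ on $(0,1]$, which is unbounded but integrable. Since this $f$ is decreasing, among all sets $E \subset (0,1]$ with $m(E) = \delta$ the integral $\int_E f$ is largest for $E = (0,\delta)$, where it equals $2\sqrt{\delta}$. The function names below are illustrative only.

```python
import math

def worst_case_integral(delta: float) -> float:
    """Integral of x**-0.5 over (0, delta): the largest value of the
    integral of f over any subset of (0, 1] of measure delta."""
    return 2.0 * math.sqrt(delta)

def delta_for(eps: float) -> float:
    """One admissible delta for this f; any delta < eps**2 / 4 works."""
    return eps ** 2 / 8.0

for eps in (1.0, 0.1, 0.01):
    d = delta_for(eps)
    # The worst-case integral stays strictly below eps.
    print(eps, d, worst_case_integral(d))
```

The choice $\delta = \epsilon^2/8$ is just one convenient value; the proof above produces a $\delta$ through truncation rather than through an explicit formula.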

Intuitively, integrable functions should in some sense vanish at infinity
since their integrals are finite, and the first part of the proposition
attaches a precise meaning to this intuition. One should observe, however,
that integrability need not guarantee the more naive pointwise vanishing
as $|x|$ becomes large. See Exercise 6.

We are now ready to prove a cornerstone of the theory of Lebesgue 
integration, the dominated convergence theorem. It can be viewed as a 
culmination of our efforts, and is a general statement about the interplay 
between limits and integrals. 

Theorem 1.13 Suppose $\{f_n\}$ is a sequence of measurable functions such
that $f_n(x) \to f(x)$ a.e. $x$, as $n$ tends to infinity. If
$|f_n(x)| \le g(x)$, where $g$ is integrable, then

$$\int |f_n - f| \to 0 \quad \text{as } n \to \infty,$$

and consequently

$$\int f_n \to \int f \quad \text{as } n \to \infty.$$


Proof. For each $N \ge 0$ let $E_N = \{x : |x| \le N,\ g(x) \le N\}$. Given
$\epsilon > 0$, we may argue as in the first part of the previous
proposition, to see that there exists $N$ so that
$\int_{E_N^c} g < \epsilon$. Then the functions $f_n \chi_{E_N}$ are
bounded (by $N$) and supported on a set of finite measure, so that by the
bounded convergence theorem, we have

$$\int_{E_N} |f_n - f| < \epsilon, \quad \text{for all large } n.$$

Hence, we obtain the estimate

$$\begin{aligned}
\int |f_n - f| &= \int_{E_N} |f_n - f| + \int_{E_N^c} |f_n - f| \\
&\le \int_{E_N} |f_n - f| + 2\int_{E_N^c} g \\
&\le \epsilon + 2\epsilon = 3\epsilon
\end{aligned}$$

for all large $n$. This proves the theorem.
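
The domination hypothesis cannot simply be dropped. A standard example (not taken from the text above, so treat it as an assumed illustration) is $f_n = n\,\chi_{(0,1/n)}$: these functions tend to $0$ at every point, yet each has integral $1$. The sketch below computes, in exact arithmetic, the integral of the smallest candidate dominating function $\max(f_1,\dots,f_N)$ and shows that it equals the harmonic number $H_N$, which is unbounded in $N$.

```python
from fractions import Fraction

def f(n: int, x: Fraction) -> Fraction:
    """f_n equals n on the open interval (0, 1/n) and 0 elsewhere."""
    return Fraction(n) if 0 < x < Fraction(1, n) else Fraction(0)

def integral_f(n: int) -> Fraction:
    """Exact integral of f_n: height n times width 1/n."""
    return Fraction(n) * Fraction(1, n)

def envelope_integral(N: int) -> Fraction:
    """Exact integral of max(f_1, ..., f_N) over (0, 1)."""
    # On (0, 1/N) the maximum is N; on [1/(n+1), 1/n) it is n, for n < N.
    total = Fraction(N) * Fraction(1, N)
    for n in range(1, N):
        total += n * (Fraction(1, n) - Fraction(1, n + 1))
    return total

print(integral_f(10), envelope_integral(4), envelope_integral(100) > 5)
```

Since any $g$ with $|f_n| \le g$ for all $n$ must dominate every such envelope, no integrable $g$ exists, which is consistent with the failure of $\int f_n \to \int 0$.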


Complex-valued functions 

If $f$ is a complex-valued function on $\mathbb{R}^d$, we may write it as

$$f(x) = u(x) + i v(x),$$

where $u$ and $v$ are real-valued functions called the real and imaginary
parts of $f$, respectively. The function $f$ is measurable if and only if
both $u$ and $v$ are measurable. We then say that $f$ is Lebesgue integrable
if the function $|f(x)| = (u(x)^2 + v(x)^2)^{1/2}$ (which is non-negative) is
Lebesgue integrable in the sense defined previously.

It is clear that

$$|u(x)| \le |f(x)| \quad \text{and} \quad |v(x)| \le |f(x)|.$$

Also, if $a, b \ge 0$, one has $(a + b)^{1/2} \le a^{1/2} + b^{1/2}$, so that

$$|f(x)| \le |u(x)| + |v(x)|.$$

As a result of these simple inequalities, we deduce that a complex-valued
function is integrable if and only if both its real and imaginary parts are
integrable. Then, the Lebesgue integral of $f$ is defined by

$$\int f(x)\,dx = \int u(x)\,dx + i \int v(x)\,dx.$$

Finally, if $E$ is a measurable subset of $\mathbb{R}^d$, and $f$ is a
complex-valued measurable function on $E$, we say that $f$ is Lebesgue
integrable on $E$ if $f\chi_E$ is integrable on $\mathbb{R}^d$, and we define
$\int_E f = \int f \chi_E$.

The collection of all complex-valued integrable functions on a measurable
subset $E \subset \mathbb{R}^d$ forms a vector space over $\mathbb{C}$.
Indeed, if $f$ and $g$ are integrable, then so is $f + g$, since the triangle
inequality gives $|(f + g)(x)| \le |f(x)| + |g(x)|$, and monotonicity of the
integral then yields

$$\int_E |f + g| \le \int_E |f| + \int_E |g| < \infty.$$

Also, it is clear that if $a \in \mathbb{C}$ and if $f$ is integrable, then
so is $af$. Finally, the integral continues to be linear over $\mathbb{C}$.


2 The space of integrable functions 

The fact that the integrable functions form a vector space is an impor- 
tant observation about the algebraic properties of such functions. A 
fundamental analytic fact is that this vector space is complete in the 
appropriate norm.

For any integrable function $f$ on $\mathbb{R}^d$ we define the norm^ of $f$,

$$\|f\| = \|f\|_{L^1} = \|f\|_{L^1(\mathbb{R}^d)} = \int_{\mathbb{R}^d} |f(x)|\,dx.$$

The collection of all integrable functions with the above norm gives a
(somewhat imprecise) definition of the space $L^1(\mathbb{R}^d)$. We also
note that $\|f\| = 0$ if and only if $f = 0$ almost everywhere (see
Proposition 1.6), and this simple property of the norm reflects the practice
we have already adopted not to distinguish two functions that agree almost
everywhere. With this in mind, we take the precise definition of
$L^1(\mathbb{R}^d)$ to be the space of equivalence classes of integrable
functions, where we define two functions to be equivalent if they agree
almost everywhere. Often, however, it is convenient to retain the (imprecise)
terminology that an element $f \in L^1(\mathbb{R}^d)$ is an integrable
function, even though it is only an equivalence class of such functions.
Note that by the above, the norm $\|f\|$ of an element
$f \in L^1(\mathbb{R}^d)$ is well-defined by the choice of any integrable
function in its equivalence class. Moreover, $L^1(\mathbb{R}^d)$ inherits the
property that it is a vector space. This and other straightforward facts
are summarized in the following proposition.

Proposition 2.1 Suppose $f$ and $g$ are two functions in $L^1(\mathbb{R}^d)$.

(i) $\|af\|_{L^1(\mathbb{R}^d)} = |a|\,\|f\|_{L^1(\mathbb{R}^d)}$ for all $a \in \mathbb{C}$.

(ii) $\|f + g\|_{L^1(\mathbb{R}^d)} \le \|f\|_{L^1(\mathbb{R}^d)} + \|g\|_{L^1(\mathbb{R}^d)}$.

(iii) $\|f\|_{L^1(\mathbb{R}^d)} = 0$ if and only if $f = 0$ a.e.

(iv) $d(f,g) = \|f - g\|_{L^1(\mathbb{R}^d)}$ defines a metric on $L^1(\mathbb{R}^d)$.

In (iv), we mean that $d$ satisfies the following conditions. First,
$d(f,g) \ge 0$ for all integrable functions $f$ and $g$, and $d(f,g) = 0$ if
and only if $f = g$ a.e. Also, $d(f,g) = d(g,f)$, and finally, $d$ satisfies
the triangle inequality

$$d(f,g) \le d(f,h) + d(h,g), \quad \text{for all } f, g, h \in L^1(\mathbb{R}^d).$$

A space $V$ with a metric $d$ is said to be complete if for every Cauchy
sequence $\{x_k\}$ in $V$ (that is, $d(x_k, x_\ell) \to 0$ as
$k, \ell \to \infty$) there exists $x \in V$ such that
$\lim_{k \to \infty} x_k = x$ in the sense that

$$d(x_k, x) \to 0, \quad \text{as } k \to \infty.$$

Our main goal of completing the space of Riemann integrable functions 
will be attained once we have established the next important theorem. 


^In this chapter the only norm we consider is the $L^1$-norm, so we often
write $\|f\|$ for $\|f\|_{L^1}$. Later, we shall have occasion to consider
other norms, and then we shall modify our notation accordingly.

Theorem 2.2 (Riesz-Fischer) The vector space $L^1$ is complete in its
metric.

Proof. Suppose $\{f_n\}$ is a Cauchy sequence in the norm, so that
$\|f_n - f_m\| \to 0$ as $n, m \to \infty$. The plan of the proof is to
extract a subsequence of $\{f_n\}$ that converges to $f$, both pointwise
almost everywhere and in the norm.

Under ideal circumstances we would have that the sequence $\{f_n\}$
converges almost everywhere to a limit $f$, and we would then prove that the
sequence converges to $f$ also in the norm. Unfortunately, almost everywhere
convergence does not hold for general Cauchy sequences (see Exercise 12).
The main point, however, is that if the convergence in the norm is rapid
enough, then almost everywhere convergence is a consequence, and this can
be achieved by dealing with an appropriate subsequence of the original
sequence.

Indeed, consider a subsequence $\{f_{n_k}\}_{k=1}^\infty$ of $\{f_n\}$ with
the following property:

$$\|f_{n_{k+1}} - f_{n_k}\| \le 2^{-k} \quad \text{for all } k \ge 1.$$

The existence of such a subsequence is guaranteed by the fact that
$\|f_n - f_m\| \le \epsilon$ whenever $n, m \ge N(\epsilon)$, so that it
suffices to take $n_k = N(2^{-k})$.
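
The choice $n_k = N(2^{-k})$ can be tried out on a toy Cauchy sequence. The sketch below assumes the example $f_n = (1 - 1/n)\chi_{[0,1]}$, for which $\|f_n - f_m\| = |1/n - 1/m| \le 1/\min(n,m)$, so that $N(\epsilon) = \lceil 1/\epsilon \rceil$ is an admissible threshold. The sequence and the helper names are illustrative assumptions, not part of the proof.

```python
from fractions import Fraction
import math

def dist(n: int, m: int) -> Fraction:
    """L^1 distance between f_n and f_m for f_n = (1 - 1/n) chi_[0,1]."""
    return abs(Fraction(1, n) - Fraction(1, m))

def N(eps: Fraction) -> int:
    """A threshold with dist(n, m) <= eps whenever n, m >= N(eps)."""
    return math.ceil(1 / eps)

def subsequence(K: int) -> list:
    """Indices n_1, ..., n_K chosen as n_k = N(2**-k)."""
    return [N(Fraction(1, 2 ** k)) for k in range(1, K + 1)]

idx = subsequence(6)
print(idx)
# Consecutive terms of the subsequence are within 2**-k of each other.
print([dist(idx[k], idx[k - 1]) for k in range(1, len(idx))])
```

For this sequence the thresholds happen to be strictly increasing; in general one may need to enlarge $N(2^{-k})$ so that $n_1 < n_2 < \cdots$.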

We now consider the series whose convergence will be seen below,

$$f(x) = f_{n_1}(x) + \sum_{k=1}^{\infty} \bigl(f_{n_{k+1}}(x) - f_{n_k}(x)\bigr)$$

and

$$g(x) = |f_{n_1}(x)| + \sum_{k=1}^{\infty} |f_{n_{k+1}}(x) - f_{n_k}(x)|,$$

and note that

$$\int |f_{n_1}| + \sum_{k=1}^{\infty} \int |f_{n_{k+1}} - f_{n_k}| \le \int |f_{n_1}| + \sum_{k=1}^{\infty} 2^{-k} < \infty.$$

So the monotone convergence theorem implies that $g$ is integrable, and
since $|f| \le g$, hence so is $f$. In particular, the series defining $f$
converges almost everywhere, and since the partial sums of this series are
precisely the $f_{n_k}$ (by construction of the telescopic series), we find
that

$$f_{n_k}(x) \to f(x) \quad \text{a.e. } x.$$

To prove that $f_{n_k} \to f$ in $L^1$ as well, we simply observe that
$|f - f_{n_k}| \le g$ for all $k$, and apply the dominated convergence
theorem to get $\|f_{n_k} - f\|_{L^1} \to 0$ as $k$ tends to infinity.

Finally, the last step of the proof consists in recalling that $\{f_n\}$ is
Cauchy. Given $\epsilon$, there exists $N$ such that for all $n, m > N$ we
have $\|f_n - f_m\| < \epsilon/2$. If $n_k$ is chosen so that $n_k > N$, and
$\|f_{n_k} - f\| < \epsilon/2$, then the triangle inequality implies

$$\|f_n - f\| \le \|f_n - f_{n_k}\| + \|f_{n_k} - f\| < \epsilon$$

whenever $n > N$. Thus $\{f_n\}$ has the limit $f$ in $L^1$, and the proof of
the theorem is complete.

Since every sequence that converges in the norm is a Cauchy sequence 
in that norm, the argument in the proof of the theorem yields the fol- 
lowing. 

Corollary 2.3 If $\{f_n\}_{n=1}^\infty$ converges to $f$ in $L^1$, then there
exists a subsequence $\{f_{n_k}\}_{k=1}^\infty$ such that

$$f_{n_k}(x) \to f(x) \quad \text{a.e. } x.$$

We say that a family $\mathcal{G}$ of integrable functions is dense in $L^1$
if for any $f \in L^1$ and $\epsilon > 0$, there exists $g \in \mathcal{G}$ so
that $\|f - g\|_{L^1} < \epsilon$. Fortunately we are familiar with many
families that are dense in $L^1$, and we describe some in the theorem that
follows. These are useful when one is faced with the problem of proving
some fact or identity involving integrable functions. In this situation a
general principle applies: the result is often easier to prove for a more
restrictive class of functions (like the ones in the theorem below), and
then a density (or limiting) argument yields the result in general.

Theorem 2.4 The following families of functions are dense in $L^1(\mathbb{R}^d)$:

(i) The simple functions.

(ii) The step functions.

(iii) The continuous functions of compact support.

Proof. Let $f$ be an integrable function on $\mathbb{R}^d$. First, we may
assume that $f$ is real-valued, because we may approximate its real and
imaginary parts independently. If this is the case, we may then write
$f = f^+ - f^-$, where $f^+, f^- \ge 0$, and it now suffices to prove the
theorem when $f \ge 0$.

For (i), Theorem 4.1 in Chapter 1 guarantees the existence of a sequence
$\{\varphi_k\}$ of non-negative simple functions that increase to $f$
pointwise. By the dominated convergence theorem (or even simply the
monotone convergence theorem) we then have

$$\|f - \varphi_k\|_{L^1} \to 0 \quad \text{as } k \to \infty.$$

Thus there are simple functions that are arbitrarily close to $f$ in the
$L^1$ norm.

For (ii), we first note that by (i) it suffices to approximate simple
functions by step functions. Then, we recall that a simple function is
a finite linear combination of characteristic functions of sets of finite
measure, so it suffices to show that if $E$ is such a set, then there is a
step function $\psi$ so that $\|\chi_E - \psi\|_{L^1}$ is small. However, we
now recall that this argument was already carried out in the proof of
Theorem 4.3, Chapter 1. Indeed, there it is shown that there is an almost
disjoint family of rectangles $\{R_j\}$ with
$m(E \,\triangle\, \bigcup_{j=1}^{M} R_j) \le 2\epsilon$. Thus $\chi_E$ and
$\psi = \sum_j \chi_{R_j}$ differ at most on a set of measure $2\epsilon$, and
as a result we find that $\|\chi_E - \psi\|_{L^1} \le 2\epsilon$.

By (ii), it suffices to establish (iii) when $f$ is the characteristic
function of a rectangle. In the one-dimensional case, where $f$ is the
characteristic function of an interval $[a, b]$, we may choose a continuous
piecewise linear function $g$ defined by

$$g(x) = \begin{cases} 1 & \text{if } a \le x \le b, \\ 0 & \text{if } x \le a - \epsilon \text{ or } x \ge b + \epsilon, \end{cases}$$

and with $g$ linear on the intervals $[a - \epsilon, a]$ and
$[b, b + \epsilon]$. Then $\|f - g\|_{L^1} < 2\epsilon$. In $d$ dimensions,
it suffices to note that the characteristic function of a rectangle is the
product of characteristic functions of intervals. Then, the desired
continuous function of compact support is simply the product of functions
like $g$ defined above.
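
The one-dimensional construction in (iii) is easy to check numerically. The sketch below builds the piecewise linear $g$ for an interval $[a,b]$ and estimates $\|\chi_{[a,b]} - g\|_{L^1}$ with a midpoint Riemann sum; the two ramps are triangles of area $\epsilon/2$ each, so the exact value is $\epsilon$, comfortably below the bound $2\epsilon$. The helper names and the step count are assumptions made for this illustration.

```python
def chi(x: float, a: float, b: float) -> float:
    """Characteristic function of the closed interval [a, b]."""
    return 1.0 if a <= x <= b else 0.0

def g(x: float, a: float, b: float, eps: float) -> float:
    """Continuous, piecewise linear: 1 on [a, b], 0 outside (a-eps, b+eps)."""
    if a <= x <= b:
        return 1.0
    if x <= a - eps or x >= b + eps:
        return 0.0
    if x < a:
        return (x - (a - eps)) / eps   # rising ramp on [a - eps, a]
    return ((b + eps) - x) / eps       # falling ramp on [b, b + eps]

def l1_distance(a: float, b: float, eps: float, steps: int = 3000) -> float:
    """Midpoint Riemann sum of |chi - g| over [a - eps, b + eps]."""
    lo, hi = a - eps, b + eps
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += abs(chi(x, a, b) - g(x, a, b, eps)) * h
    return total

print(l1_distance(0.0, 1.0, 0.25))
```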

The results above for $L^1(\mathbb{R}^d)$ lead immediately to an extension in
which $\mathbb{R}^d$ can be replaced by any fixed subset $E$ of positive
measure. In fact if $E$ is such a subset, we can define $L^1(E)$ and carry
out the arguments that are analogous to $L^1(\mathbb{R}^d)$. Better yet, we
can proceed by extending any function $f$ on $E$ by setting
$\tilde{f} = f$ on $E$ and $\tilde{f} = 0$ on $E^c$, and defining
$\|f\|_{L^1(E)} = \|\tilde{f}\|_{L^1(\mathbb{R}^d)}$. The analogues of
Proposition 2.1 and Theorem 2.2 then hold for the space $L^1(E)$.

Invariance Properties

If $f$ is a function defined on $\mathbb{R}^d$, the translation of $f$ by a
vector $h \in \mathbb{R}^d$ is the function $f_h$, defined by
$f_h(x) = f(x - h)$. Here we want to examine some basic aspects of
translations of integrable functions.

First, there is the translation-invariance of the integral. One way to
state this is as follows: if $f$ is an integrable function, then so is $f_h$
and

$$\int_{\mathbb{R}^d} f(x - h)\,dx = \int_{\mathbb{R}^d} f(x)\,dx. \tag{4}$$

We check this assertion first when $f = \chi_E$, the characteristic function
of a measurable set $E$. Then obviously $f_h = \chi_{E_h}$, where
$E_h = \{x + h : x \in E\}$, and thus the assertion follows because
$m(E_h) = m(E)$ (see Section 3 in Chapter 1). As a result of linearity, the
identity (4) holds for all simple functions. Now if $f$ is non-negative and
$\{\varphi_n\}$ is a sequence of simple functions that increase pointwise
a.e. to $f$ (such a sequence exists by Theorem 4.1 in the previous chapter),
then $\{(\varphi_n)_h\}$ is a sequence of simple functions that increase to
$f_h$ pointwise a.e., and the monotone convergence theorem implies (4) in
this special case. Thus, if $f$ is complex-valued and integrable we see that
$\int_{\mathbb{R}^d} |f(x - h)|\,dx = \int_{\mathbb{R}^d} |f(x)|\,dx$, which
shows that $f_h \in L^1(\mathbb{R}^d)$ and also $\|f_h\| = \|f\|$. From the
definitions, we then conclude that (4) holds whenever $f \in L^1$.

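Incidentally, using the relative invariance of Lebesgue measure under
dilations and reflections (Section 3, Chapter 1) one can prove in the same
way that if $f(x)$ is integrable, so is $f(\delta x)$, $\delta > 0$, and
$f(-x)$, and

$$\delta^d \int_{\mathbb{R}^d} f(\delta x)\,dx = \int_{\mathbb{R}^d} f(x)\,dx, \quad \text{while} \quad \int_{\mathbb{R}^d} f(-x)\,dx = \int_{\mathbb{R}^d} f(x)\,dx. \tag{5}$$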

We digress to record for later use two useful consequences of the above
invariance properties:

(i) Suppose that $f$ and $g$ are a pair of measurable functions on
$\mathbb{R}^d$ so that for some fixed $x \in \mathbb{R}^d$ the function
$y \mapsto f(x - y)g(y)$ is integrable. As a consequence, the function
$y \mapsto f(y)g(x - y)$ is then also integrable and we have

$$\int_{\mathbb{R}^d} f(x - y)g(y)\,dy = \int_{\mathbb{R}^d} f(y)g(x - y)\,dy. \tag{6}$$

This follows from (4) and (5) on making the change of variables which
replaces $y$ by $x - y$, and noting that this change is a combination of a
translation and a reflection.

The integral on the left-hand side is denoted by $(f * g)(x)$ and is
defined as the convolution of $f$ and $g$. Thus (6) asserts the
commutativity of the convolution product.

(ii) Using (5) one has that for all $\epsilon > 0$

$$\int_{|x| \ge \epsilon} \frac{dx}{|x|^a} = \epsilon^{-a+d} \int_{|x| \ge 1} \frac{dx}{|x|^a}, \quad \text{whenever } a > d, \tag{7}$$

and

$$\int_{|x| \le \epsilon} \frac{dx}{|x|^a} = \epsilon^{-a+d} \int_{|x| \le 1} \frac{dx}{|x|^a}, \quad \text{whenever } a < d. \tag{8}$$

It can also be seen that the integrals $\int_{|x| \ge 1} \frac{dx}{|x|^a}$
and $\int_{|x| \le 1} \frac{dx}{|x|^a}$ (respectively, when $a > d$ and
$a < d$) are finite by the argument that appears after Corollary 1.10.

Translations and continuity 

We shall next examine how continuity properties of $f$ are related to the
way the translations $f_h$ vary with $h$. Note that for any given
$x \in \mathbb{R}^d$, the statement that $f_h(x) \to f(x)$ as $h \to 0$ is
the same as the continuity of $f$ at the point $x$.

However, a general $f$ which is integrable may be discontinuous at every
$x$, even when corrected on a set of measure zero; see Exercise 15.
Nevertheless, there is an overall continuity that an arbitrary
$f \in L^1(\mathbb{R}^d)$ enjoys, one that holds in the norm.

Proposition 2.5 Suppose $f \in L^1(\mathbb{R}^d)$. Then

$$\|f_h - f\|_{L^1} \to 0 \quad \text{as } h \to 0.$$

The proof is a simple consequence of the approximation of integrable
functions by continuous functions of compact support as given in
Theorem 2.4. In fact for any $\epsilon > 0$, we can find such a function $g$
so that $\|f - g\| < \epsilon$. Now

$$f_h - f = (g_h - g) + (f_h - g_h) - (f - g).$$

However, $\|f_h - g_h\| = \|f - g\| < \epsilon$, while since $g$ is continuous
and has compact support we have that clearly

$$\|g_h - g\| = \int_{\mathbb{R}^d} |g(x - h) - g(x)|\,dx \to 0 \quad \text{as } h \to 0.$$

So if $|h| < \delta$, where $\delta$ is sufficiently small, then
$\|g_h - g\| < \epsilon$, and as a result $\|f_h - f\| < 3\epsilon$, whenever
$|h| < \delta$.
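
Continuity in the norm does not require pointwise continuity. As an assumed one-dimensional illustration, take $f = \chi_{[0,1]}$, which is discontinuous at $0$ and $1$. Its translate is $f_h = \chi_{[h,1+h]}$, and $\|f_h - f\|_{L^1}$ is the measure of the symmetric difference of the two intervals, namely $2\min(|h|, 1)$. The sketch below evaluates this closed form.

```python
def translate_distance(h: float) -> float:
    """L^1 distance between chi_[0,1] and chi_[h,1+h]: the measure of
    the symmetric difference of the intervals [0, 1] and [h, 1 + h]."""
    overlap = max(0.0, 1.0 - abs(h))   # length of the intersection
    return 2.0 * (1.0 - overlap)

for h in (1.0, 0.25, 0.01, 0.0):
    # The distance shrinks linearly to 0 as h -> 0.
    print(h, translate_distance(h))
```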
3 Fubini’s theorem

In elementary calculus integrals of continuous functions of several
variables are often calculated by iterating one-dimensional integrals. We
shall now examine this important analytic device from the general point
of view of Lebesgue integration in $\mathbb{R}^d$, and we shall see that a
number of interesting issues arise.

In general, we may write $\mathbb{R}^d$ as a product

$$\mathbb{R}^d = \mathbb{R}^{d_1} \times \mathbb{R}^{d_2} \quad \text{where } d = d_1 + d_2, \text{ and } d_1, d_2 \ge 1.$$

A point in $\mathbb{R}^d$ then takes the form $(x, y)$, where
$x \in \mathbb{R}^{d_1}$ and $y \in \mathbb{R}^{d_2}$. With such a
decomposition of $\mathbb{R}^d$ in mind, the general notion of a slice,
formed by fixing one variable, becomes natural. If $f$ is a function in
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, the slice of $f$ corresponding to
$y \in \mathbb{R}^{d_2}$ is the function $f^y$ of the
$x \in \mathbb{R}^{d_1}$ variable, given by

$$f^y(x) = f(x, y).$$

Similarly, the slice of $f$ for a fixed $x \in \mathbb{R}^{d_1}$ is
$f_x(y) = f(x, y)$.

In the case of a set $E \subset \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$ we
define its slices by

$$E^y = \{x \in \mathbb{R}^{d_1} : (x, y) \in E\} \quad \text{and} \quad E_x = \{y \in \mathbb{R}^{d_2} : (x, y) \in E\}.$$

See Figure 1 for an illustration.

Figure 1. Slices $E^y$ and $E_x$ (for fixed $x$ and $y$) of a set $E$


3.1 Statement and proof of the theorem 

That the theorem that follows is not entirely straightforward is clear
from the first difficulty that arises in its formulation, involving the
measurability of the functions and sets in question. In fact, even with the
assumption that $f$ is measurable on $\mathbb{R}^d$, it is not necessarily
true that the slice $f^y$ is measurable on $\mathbb{R}^{d_1}$ for each $y$,
nor does the corresponding assertion necessarily hold for a measurable set:
the slice $E^y$ may not be measurable for each $y$. An easy example arises
in $\mathbb{R}^2$ by placing a one-dimensional non-measurable set on the
$x$-axis; the set $E$ in $\mathbb{R}^2$ has measure zero, but $E^y$ is not
measurable for $y = 0$. What saves us is that, nevertheless, measurability
holds for almost all slices.

The main theorem is as follows. We recall that by definition all inte- 
grable functions are measurable. 

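Theorem 3.1 Suppose $f(x, y)$ is integrable on
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. Then for almost every
$y \in \mathbb{R}^{d_2}$:

(i) The slice $f^y$ is integrable on $\mathbb{R}^{d_1}$.

(ii) The function defined by $\int_{\mathbb{R}^{d_1}} f^y(x)\,dx$ is integrable on $\mathbb{R}^{d_2}$.

Moreover:

(iii) $\displaystyle \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x, y)\,dx \right) dy = \int_{\mathbb{R}^d} f.$

Clearly, the theorem is symmetric in $x$ and $y$ so that we also may
conclude that the slice $f_x$ is integrable on $\mathbb{R}^{d_2}$ for a.e.
$x$. Moreover, $\int_{\mathbb{R}^{d_2}} f_x(y)\,dy$ is integrable, and

$$\int_{\mathbb{R}^{d_1}} \left( \int_{\mathbb{R}^{d_2}} f(x, y)\,dy \right) dx = \int_{\mathbb{R}^d} f.$$

In particular, Fubini’s theorem states that the integral of $f$ on
$\mathbb{R}^d$ can be computed by iterating lower-dimensional integrals, and
that the iterations can be taken in any order

$$\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x, y)\,dx \right) dy = \int_{\mathbb{R}^{d_1}} \left( \int_{\mathbb{R}^{d_2}} f(x, y)\,dy \right) dx = \int_{\mathbb{R}^d} f.$$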

We first note that we may assume that $f$ is real-valued, since the
theorem then applies to the real and imaginary parts of a complex-valued
function. The proof of Fubini’s theorem which we give next consists of a
sequence of six steps. We begin by letting $\mathcal{F}$ denote the set of
integrable functions on $\mathbb{R}^d$ which satisfy all three conclusions in
the theorem, and set out to prove that
$L^1(\mathbb{R}^d) \subset \mathcal{F}$.

We proceed by first showing that $\mathcal{F}$ is closed under operations
such as linear combinations (Step 1) and limits (Step 2). Then we begin to
construct families of functions in $\mathcal{F}$. Since any integrable
function is the “limit” of simple functions, and simple functions are
themselves linear combinations of characteristic functions of sets of finite
measure, the goal quickly becomes to prove that $\chi_E$ belongs to
$\mathcal{F}$ whenever $E$ is a measurable subset of $\mathbb{R}^d$ with
finite measure. To achieve this goal, we begin with rectangles and work
our way up to sets of type $G_\delta$ (Step 3), and sets of measure zero
(Step 4). Finally, a limiting argument shows that all integrable functions
are in $\mathcal{F}$. This will complete the proof of Fubini’s theorem.


Step 1. Any finite linear combination of functions in $\mathcal{F}$ also
belongs to $\mathcal{F}$.

Indeed, let $\{f_k\}_{k=1}^N \subset \mathcal{F}$. For each $k$ there exists
a set $A_k \subset \mathbb{R}^{d_2}$ of measure 0 so that $f_k^y$ is
integrable on $\mathbb{R}^{d_1}$ whenever $y \notin A_k$. Then, if
$A = \bigcup_{k=1}^N A_k$, the set $A$ has measure 0, and in the complement
of $A$, the $y$-slice corresponding to any finite linear combination of the
$f_k$ is measurable, and also integrable. By linearity of the integral, we
then conclude that any linear combination of the $f_k$'s belongs to
$\mathcal{F}$.

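Step 2. Suppose $\{f_k\}$ is a sequence of measurable functions in
$\mathcal{F}$ so that $f_k \nearrow f$ or $f_k \searrow f$, where $f$ is
integrable (on $\mathbb{R}^d$). Then $f \in \mathcal{F}$.

By taking $-f_k$ instead of $f_k$ if necessary, we note that it suffices to
consider the case of an increasing sequence. Also, we may replace $f_k$
by $f_k - f_1$ and assume that the $f_k$'s are non-negative. Now, we observe
that an application of the monotone convergence theorem (Corollary 1.9)
yields

$$\lim_{k \to \infty} \int_{\mathbb{R}^d} f_k(x, y)\,dx\,dy = \int_{\mathbb{R}^d} f(x, y)\,dx\,dy. \tag{9}$$

By assumption, for each $k$ there exists a set
$A_k \subset \mathbb{R}^{d_2}$ of measure 0, so that $f_k^y$ is integrable
on $\mathbb{R}^{d_1}$ whenever $y \notin A_k$. If
$A = \bigcup_{k=1}^\infty A_k$, then $m(A) = 0$ in $\mathbb{R}^{d_2}$, and if
$y \notin A$, then $f_k^y$ is integrable on $\mathbb{R}^{d_1}$ for all $k$,
and, by the monotone convergence theorem, we find that

$$g_k(y) = \int_{\mathbb{R}^{d_1}} f_k^y(x)\,dx \quad \text{increases to a limit} \quad g(y) = \int_{\mathbb{R}^{d_1}} f^y(x)\,dx$$

as $k$ tends to infinity. By assumption, each $g_k(y)$ is integrable, so that
another application of the monotone convergence theorem yields

$$\int_{\mathbb{R}^{d_2}} g_k(y)\,dy \to \int_{\mathbb{R}^{d_2}} g(y)\,dy \quad \text{as } k \to \infty. \tag{10}$$

By the assumption that $f_k \in \mathcal{F}$ we have

$$\int_{\mathbb{R}^{d_2}} g_k(y)\,dy = \int_{\mathbb{R}^d} f_k(x, y)\,dx\,dy,$$

and combining this fact with (9) and (10), we conclude that

$$\int_{\mathbb{R}^{d_2}} g(y)\,dy = \int_{\mathbb{R}^d} f(x, y)\,dx\,dy.$$

Since $f$ is integrable, the right-hand integral is finite, and this proves
that $g$ is integrable. Consequently $g(y) < \infty$ a.e. $y$, hence $f^y$ is
integrable for a.e. $y$, and

$$\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x, y)\,dx \right) dy = \int_{\mathbb{R}^d} f(x, y)\,dx\,dy.$$

This proves that $f \in \mathcal{F}$ as desired.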

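Step 3. Any characteristic function of a set $E$ that is a $G_\delta$ and of
finite measure belongs to $\mathcal{F}$.

We proceed in stages of increasing order of generality.

(a) First suppose $E$ is a bounded open cube in $\mathbb{R}^d$, such that
$E = Q_1 \times Q_2$, where $Q_1$ and $Q_2$ are open cubes in
$\mathbb{R}^{d_1}$ and $\mathbb{R}^{d_2}$, respectively. Then, for each $y$
the function $\chi_E(x, y)$ is measurable in $x$, and integrable with

$$g(y) = \int_{\mathbb{R}^{d_1}} \chi_E(x, y)\,dx = \begin{cases} |Q_1| & \text{if } y \in Q_2, \\ 0 & \text{otherwise.} \end{cases}$$

Consequently, $g = |Q_1| \chi_{Q_2}$ is also measurable and integrable, with

$$\int_{\mathbb{R}^{d_2}} g(y)\,dy = |Q_1|\,|Q_2|.$$

Since we initially have
$\int_{\mathbb{R}^d} \chi_E(x, y)\,dx\,dy = |E| = |Q_1|\,|Q_2|$, we deduce
that $\chi_E \in \mathcal{F}$.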

(b) Now suppose $E$ is a subset of the boundary of some closed cube.
Then, since the boundary of a cube has measure 0 in $\mathbb{R}^d$, we have
$\int_{\mathbb{R}^d} \chi_E(x, y)\,dx\,dy = 0$.

Next, we note, after an investigation of the various possibilities, that
for almost every $y$, the slice $E^y$ has measure 0 in $\mathbb{R}^{d_1}$,
and therefore if $g(y) = \int_{\mathbb{R}^{d_1}} \chi_E(x, y)\,dx$ we have
$g(y) = 0$ for a.e. $y$. As a consequence,
$\int_{\mathbb{R}^{d_2}} g(y)\,dy = 0$, and therefore
$\chi_E \in \mathcal{F}$.

(c) Suppose now $E$ is a finite union of closed cubes whose interiors are
disjoint, $E = \bigcup_{k=1}^K Q_k$. Then, if $\tilde{Q}_k$ denotes the
interior of $Q_k$, we may write $\chi_E$ as a linear combination of the
$\chi_{\tilde{Q}_k}$ and $\chi_{A_k}$ where $A_k$ is a subset of the boundary
of $Q_k$ for $k = 1, \ldots, K$. By our previous analysis, we know that
$\chi_{\tilde{Q}_k}$ and $\chi_{A_k}$ belong to $\mathcal{F}$ for all $k$, and
since Step 1 guarantees that $\mathcal{F}$ is closed under finite linear
combinations, we conclude that $\chi_E \in \mathcal{F}$, as desired.

(d) Next, we prove that if $E$ is open and of finite measure, then
$\chi_E \in \mathcal{F}$. This follows from taking a limit in the previous
case. Indeed, by Theorem 1.4 in Chapter 1, we may write $E$ as a countable
union of almost disjoint closed cubes

$$E = \bigcup_{j=1}^{\infty} Q_j.$$

Consequently, if we let $f_k = \chi_{\bigcup_{j=1}^{k} Q_j}$, then we note
that the functions $f_k$ increase to $f = \chi_E$, which is integrable since
$m(E)$ is finite. Therefore, we may conclude by Step 2 that
$f \in \mathcal{F}$.

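(e) Finally, if $E$ is a $G_\delta$ of finite measure, then
$\chi_E \in \mathcal{F}$. Indeed, by definition, there exist open sets
$\tilde{\mathcal{O}}_1, \tilde{\mathcal{O}}_2, \ldots$ such that

$$E = \bigcap_{k=1}^{\infty} \tilde{\mathcal{O}}_k.$$

Since $E$ has finite measure, there exists an open set
$\tilde{\mathcal{O}}_0$ of finite measure with
$E \subset \tilde{\mathcal{O}}_0$. If we let

$$\mathcal{O}_k = \tilde{\mathcal{O}}_0 \cap \bigcap_{j=1}^{k} \tilde{\mathcal{O}}_j,$$

then we note that we have a decreasing sequence of open sets of finite
measure $\mathcal{O}_1 \supset \mathcal{O}_2 \supset \cdots$ with

$$E = \bigcap_{k=1}^{\infty} \mathcal{O}_k.$$

Therefore, the sequence of functions $f_k = \chi_{\mathcal{O}_k}$ decreases
to $f = \chi_E$, and since $\chi_{\mathcal{O}_k} \in \mathcal{F}$ for all $k$
by (d) above, we conclude by Step 2 that $\chi_E$ belongs to $\mathcal{F}$.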

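Step 4. If $E$ has measure 0, then $\chi_E$ belongs to $\mathcal{F}$.

Indeed, since $E$ is measurable, we may choose a set $G$ of type $G_\delta$
with $E \subset G$ and $m(G) = 0$ (Corollary 3.5, Chapter 1). Since
$\chi_G \in \mathcal{F}$ (by the previous step) we find that

$$\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} \chi_G(x, y)\,dx \right) dy = \int_{\mathbb{R}^d} \chi_G = 0.$$

Therefore

$$\int_{\mathbb{R}^{d_1}} \chi_G(x, y)\,dx = 0 \quad \text{for a.e. } y.$$

Consequently, the slice $G^y$ has measure 0 for a.e. $y$. The simple
observation that $E^y \subset G^y$ then shows that $E^y$ has measure 0 for
a.e. $y$, and $\int_{\mathbb{R}^{d_1}} \chi_E(x, y)\,dx = 0$ for a.e. $y$.
Therefore,

$$\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} \chi_E(x, y)\,dx \right) dy = 0 = \int_{\mathbb{R}^d} \chi_E,$$

and thus $\chi_E \in \mathcal{F}$, as was to be shown.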

Step 5. If $E$ is any measurable subset of $\mathbb{R}^d$ with finite
measure, then $\chi_E$ belongs to $\mathcal{F}$.

To prove this, recall first that there exists a set of finite measure $G$ of
type $G_\delta$, with $E \subset G$ and $m(G - E) = 0$. Since

$$\chi_E = \chi_G - \chi_{G - E},$$

and $\mathcal{F}$ is closed under linear combinations, we find that
$\chi_E \in \mathcal{F}$, as desired.

Step 6. This is the final step, which consists of proving that if $f$ is
integrable, then $f \in \mathcal{F}$.

We note first that $f$ has the decomposition $f = f^+ - f^-$, where both
$f^+$ and $f^-$ are non-negative and integrable, so by Step 1 we may assume
that $f$ is itself non-negative. By Theorem 4.1 in the previous chapter,
there exists a sequence $\{\varphi_k\}$ of simple functions that increase to
$f$. Since each $\varphi_k$ is a finite linear combination of characteristic
functions of sets with finite measure, we have $\varphi_k \in \mathcal{F}$ by
Steps 5 and 1, hence $f \in \mathcal{F}$ by Step 2.

3.2 Applications of Fubini’s theorem 

Theorem 3.2 Suppose $f(x, y)$ is a non-negative measurable function on
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. Then for almost every
$y \in \mathbb{R}^{d_2}$:

(i) The slice $f^y$ is measurable on $\mathbb{R}^{d_1}$.

(ii) The function defined by $\int_{\mathbb{R}^{d_1}} f^y(x)\,dx$ is measurable on $\mathbb{R}^{d_2}$.

Moreover:

(iii) $\displaystyle \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x, y)\,dx \right) dy = \int_{\mathbb{R}^d} f(x, y)\,dx\,dy$ in the extended sense.

In practice, this theorem is often used in conjunction with Fubini’s
theorem.^ Indeed, suppose we are given a measurable function $f$ on
$\mathbb{R}^d$ and asked to compute $\int_{\mathbb{R}^d} f$. To justify the
use of iterated integration, we first apply the present theorem to $|f|$.
Using it, we may freely compute (or estimate) the iterated integrals of the
non-negative function $|f|$. If these are finite, Theorem 3.2 guarantees that
$f$ is integrable, that is, $\int |f| < \infty$. Then the hypothesis in
Fubini’s theorem is verified, and we may use that theorem in the
calculation of the integral of $f$.
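
The check on $|f|$ is not a formality. A standard example (assumed here for illustration, not taken from the text above) is the function on $[0,\infty)^2$ that equals $1$ on each diagonal unit square $[n,n+1) \times [n,n+1)$, equals $-1$ on the neighbouring square $[n+1,n+2) \times [n,n+1)$, and vanishes elsewhere. Both iterated integrals exist, but they disagree, and $\int |f| = \infty$. The sketch below works with the values on unit squares, so every inner integral is a finite sum and the arithmetic is exact.

```python
def a(m: int, n: int) -> int:
    """Value of f on the unit square [m, m+1) x [n, n+1), with m, n >= 0."""
    if m == n:
        return 1
    if m == n + 1:
        return -1
    return 0

def x_then_y(N: int) -> int:
    """Integrate in x over [0, oo) first, then in y over [0, N)."""
    # For fixed n only m = n and m = n + 1 contribute to the inner integral.
    return sum(sum(a(m, n) for m in range(n + 2)) for n in range(N))

def y_then_x(N: int) -> int:
    """Integrate in y over [0, oo) first, then in x over [0, N)."""
    # For fixed m only n = m - 1 and n = m contribute to the inner integral.
    return sum(sum(a(m, n) for n in range(m + 1)) for m in range(N))

def abs_integral(N: int) -> int:
    """Integral of |f| over the square [0, N) x [0, N)."""
    return sum(abs(a(m, n)) for m in range(N) for n in range(N))

print(x_then_y(50), y_then_x(50), abs_integral(50))
```

The two orders of integration give $0$ and $1$, while the integral of $|f|$ over $[0,N)^2$ is $2N - 1$, which grows without bound, so Theorem 3.1 does not apply.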

Proof of Theorem 3.2. Consider the truncations

$$f_k(x, y) = \begin{cases} f(x, y) & \text{if } |(x, y)| < k \text{ and } f(x, y) < k, \\ 0 & \text{otherwise.} \end{cases}$$

Each $f_k$ is integrable, and by part (i) in Fubini’s theorem there exists a
set $E_k \subset \mathbb{R}^{d_2}$ of measure 0 such that the slice
$f_k^y(x)$ is measurable for all $y \in E_k^c$. Then, if we set
$E = \bigcup_k E_k$, we find that $f_k^y(x)$ is measurable for all
$y \in E^c$ and all $k$. Moreover, $m(E) = 0$. Since $f_k^y \nearrow f^y$,
the monotone convergence theorem implies that if $y \notin E$, then

$$\int_{\mathbb{R}^{d_1}} f_k(x, y)\,dx \nearrow \int_{\mathbb{R}^{d_1}} f(x, y)\,dx \quad \text{as } k \to \infty.$$

Again by Fubini’s theorem, $\int_{\mathbb{R}^{d_1}} f_k(x, y)\,dx$ is
measurable for all $y \in E^c$, hence so is
$\int_{\mathbb{R}^{d_1}} f(x, y)\,dx$. Another application of the monotone
convergence theorem then gives

$$\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f_k(x, y)\,dx \right) dy \to \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x, y)\,dx \right) dy. \tag{11}$$

By part (iii) in Fubini’s theorem we know that

$$\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f_k(x, y)\,dx \right) dy = \int_{\mathbb{R}^d} f_k. \tag{12}$$


^Theorem 3.2 was formulated by Tonelli. We will, however, use the short-hand of
referring to it, as well as Theorem 3.1 and Corollary 3.3, as Fubini’s theorem.

A final application of the monotone convergence theorem directly to $f_k$
also gives

$$\int_{\mathbb{R}^d} f_k \to \int_{\mathbb{R}^d} f. \tag{13}$$

Combining (11), (12), and (13) completes the proof of Theorem 3.2.

Corollary 3.3 If $E$ is a measurable set in
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, then for almost every
$y \in \mathbb{R}^{d_2}$ the slice

$$E^y = \{x \in \mathbb{R}^{d_1} : (x, y) \in E\}$$

is a measurable subset of $\mathbb{R}^{d_1}$. Moreover $m(E^y)$ is a
measurable function of $y$ and

$$m(E) = \int_{\mathbb{R}^{d_2}} m(E^y)\,dy.$$

This is an immediate consequence of the first part of Theorem 3.2 applied
to the function $\chi_E$. Clearly a symmetric result holds for the
$x$-slices in $\mathbb{R}^{d_2}$.
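
As a quick numerical sanity check of the slice formula, take $E$ to be the closed unit disk in $\mathbb{R}^2$ (an assumed example). Its slice at height $y$ is the chord of length $m(E^y) = 2\sqrt{1 - y^2}$ for $|y| \le 1$, and integrating this over $y$ should return $m(E) = \pi$. The sketch below uses a midpoint Riemann sum; the step count is an arbitrary choice.

```python
import math

def slice_length(y: float) -> float:
    """m(E^y) for the closed unit disk E: the length of the chord at height y."""
    return 2.0 * math.sqrt(1.0 - y * y) if abs(y) <= 1.0 else 0.0

def measure_from_slices(steps: int = 100000) -> float:
    """Midpoint Riemann sum of m(E^y) over y in [-1, 1]."""
    h = 2.0 / steps
    return sum(slice_length(-1.0 + (i + 0.5) * h) for i in range(steps)) * h

print(measure_from_slices(), math.pi)
```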

We have thus established the basic fact that if $E$ is measurable on
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, then for almost every
$y \in \mathbb{R}^{d_2}$ the slice $E^y$ is measurable in
$\mathbb{R}^{d_1}$ (and also the symmetric statement with the roles of $x$
and $y$ interchanged). One might be tempted to think that the converse
assertion holds. To see that this is not the case, note that if we let
$\mathcal{N}$ denote a non-measurable subset of $\mathbb{R}$, and then define

$$E = [0, 1] \times \mathcal{N} \subset \mathbb{R} \times \mathbb{R},$$

we see that

$$E^y = \begin{cases} [0, 1] & \text{if } y \in \mathcal{N}, \\ \emptyset & \text{if } y \notin \mathcal{N}. \end{cases}$$

Thus $E^y$ is measurable for every $y$. However, if $E$ were measurable, then
the corollary would imply that $E_x = \{y \in \mathbb{R} : (x, y) \in E\}$ is
measurable for almost every $x \in \mathbb{R}$, which is not true since
$E_x$ is equal to $\mathcal{N}$ for all $x \in [0, 1]$.

A more striking example is that of a set $E$ in the unit square
$[0, 1] \times [0, 1]$ that is not measurable, and yet the slices $E^y$ and
$E_x$ are measurable with $m(E^y) = 0$ and $m(E_x) = 1$ for each
$x, y \in [0, 1]$. The construction of $E$ is based on the existence of a
highly paradoxical ordering $\prec$ of the reals, with the property that
$\{x : x \prec y\}$ is a countable set for each $y \in \mathbb{R}$. (The
construction of this ordering is discussed in Problem 5.) Given this
ordering we let

$$E = \{(x, y) \in [0, 1] \times [0, 1], \text{ with } x \prec y\}.$$

Note that for each $y \in [0, 1]$, $E^y = \{x : x \prec y\}$; thus $E^y$ is
countable and $m(E^y) = 0$. Similarly $m(E_x) = 1$, because $E_x$ is the
complement of a denumerable set in $[0, 1]$. If $E$ were measurable, it would
contradict the formula in Corollary 3.3.

In relating a set $E$ to its slices $E_x$ and $E^y$, matters are
straightforward for the basic sets which arise when we consider
$\mathbb{R}^d$ as the product $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$.
These are the product sets $E = E_1 \times E_2$, where
$E_j \subset \mathbb{R}^{d_j}$.

Proposition 3.4 If $E = E_1 \times E_2$ is a measurable subset of
$\mathbb{R}^d$, and $m_*(E_2) > 0$, then $E_1$ is measurable.

Proof. By Corollary 3.3, we know that for a.e. $y \in \mathbb{R}^{d_2}$, the
slice function

$$(\chi_{E_1 \times E_2})^y(x) = \chi_{E_1}(x)\,\chi_{E_2}(y)$$

is measurable as a function of $x$. In fact, we claim that there is some
$y \in E_2$ such that the above slice function is measurable in $x$; for such
a $y$ we would have $\chi_{E_1 \times E_2}(x, y) = \chi_{E_1}(x)$, and this
would imply that $E_1$ is measurable.

To prove the existence of such a $y$, we use the assumption that
$m_*(E_2) > 0$. Indeed, let $F$ denote the set of $y \in \mathbb{R}^{d_2}$
such that the slice $E^y$ is measurable. Then $m(F^c) = 0$ (by the previous
corollary). However, $E_2 \cap F$ is not empty because
$m_*(E_2 \cap F) > 0$. To see this, note that
$E_2 = (E_2 \cap F) \cup (E_2 \cap F^c)$, hence

$$0 < m_*(E_2) \le m_*(E_2 \cap F) + m_*(E_2 \cap F^c) = m_*(E_2 \cap F),$$

because $E_2 \cap F^c$ is a subset of a set of measure zero.

To deal with a converse of the above result, we need the following 
lemma. 

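Lemma 3.5 If $E_1 \subset \mathbb{R}^{d_1}$ and
$E_2 \subset \mathbb{R}^{d_2}$, then

$$m_*(E_1 \times E_2) \le m_*(E_1)\,m_*(E_2),$$

with the understanding that if one of the sets $E_j$ has exterior measure
zero, then $m_*(E_1 \times E_2) = 0$.

Proof. Let $\epsilon > 0$. By definition, we can find cubes
$\{Q_k\}_{k=1}^\infty$ in $\mathbb{R}^{d_1}$ and $\{Q'_\ell\}_{\ell=1}^\infty$
in $\mathbb{R}^{d_2}$ such that

$$E_1 \subset \bigcup_{k=1}^{\infty} Q_k \quad \text{and} \quad E_2 \subset \bigcup_{\ell=1}^{\infty} Q'_\ell$$

and

$$\sum_{k=1}^{\infty} |Q_k| \le m_*(E_1) + \epsilon \quad \text{and} \quad \sum_{\ell=1}^{\infty} |Q'_\ell| \le m_*(E_2) + \epsilon.$$

Since $E_1 \times E_2 \subset \bigcup_{k,\ell=1}^{\infty} Q_k \times Q'_\ell$,
the sub-additivity of the exterior measure yields

$$\begin{aligned}
m_*(E_1 \times E_2) &\le \sum_{k,\ell=1}^{\infty} |Q_k \times Q'_\ell| \\
&= \left( \sum_{k=1}^{\infty} |Q_k| \right) \left( \sum_{\ell=1}^{\infty} |Q'_\ell| \right) \\
&\le (m_*(E_1) + \epsilon)(m_*(E_2) + \epsilon).
\end{aligned}$$

If neither $E_1$ nor $E_2$ has exterior measure 0, then from the above we
find

$$m_*(E_1 \times E_2) \le m_*(E_1)\,m_*(E_2) + O(\epsilon),$$

and since $\epsilon$ is arbitrary, we must have
$m_*(E_1 \times E_2) \le m_*(E_1)\,m_*(E_2)$.

If for instance $m_*(E_1) = 0$, consider for each positive integer $j$ the
set $E_2^j = E_2 \cap \{y \in \mathbb{R}^{d_2} : |y| \le j\}$. Then, by the
above argument, we find that $m_*(E_1 \times E_2^j) = 0$. Since
$(E_1 \times E_2^j) \nearrow (E_1 \times E_2)$ as $j \to \infty$, we conclude
that $m_*(E_1 \times E_2) = 0$.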

Proposition 3.6 Suppose $E_1$ and $E_2$ are measurable subsets of $\mathbb{R}^{d_1}$ and
$\mathbb{R}^{d_2}$, respectively. Then $E = E_1 \times E_2$ is a measurable subset of $\mathbb{R}^d$.
Moreover,

$$m(E) = m(E_1)\, m(E_2),$$

with the understanding that if one of the sets $E_j$ has measure zero, then
$m(E) = 0$.

Proof. It suffices to prove that $E$ is measurable, because then the
assertion about $m(E)$ follows from Corollary 3.3. Since each set $E_j$ is
measurable, there exist sets $G_j \subset \mathbb{R}^{d_j}$ of type $G_\delta$, with $G_j \supset E_j$ and
$m_*(G_j - E_j) = 0$ for each $j = 1, 2$. (See Corollary 3.5 in Chapter 1.)
Clearly, $G = G_1 \times G_2$ is measurable in $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$ and

$$(G_1 \times G_2) - (E_1 \times E_2) \subset \big((G_1 - E_1) \times G_2\big) \cup \big(G_1 \times (G_2 - E_2)\big).$$

By the lemma we conclude that $m_*(G - E) = 0$, hence $E$ is measurable.


As a consequence of this proposition we have the following. 

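Corollary 3.7 Suppose $f$ is a measurable function on $\mathbb{R}^{d_1}$. Then the
function $\tilde{f}$ defined by $\tilde{f}(x,y) = f(x)$ is measurable on $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$.

Proof. To see this, we may assume that $f$ is real-valued, and recall
first that if $a \in \mathbb{R}$ and $E_1 = \{x \in \mathbb{R}^{d_1} : f(x) < a\}$, then $E_1$ is measurable
by definition. Since

$$\{(x,y) \in \mathbb{R}^{d_1} \times \mathbb{R}^{d_2} : \tilde{f}(x,y) < a\} = E_1 \times \mathbb{R}^{d_2},$$

the previous proposition shows that $\{\tilde{f}(x,y) < a\}$ is measurable for each
$a \in \mathbb{R}$. Thus $\tilde{f}(x,y)$ is a measurable function on $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, as desired.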

Finally, we return to an interpretation of the integral that arose first in
the calculus. We have in mind the notion that $\int f$ describes the "area"
under the graph of $f$. Here we relate this to the Lebesgue integral and
show how it extends to our more general context.

Corollary 3.8 Suppose $f(x)$ is a non-negative function on $\mathbb{R}^d$, and let

$$A = \{(x,y) \in \mathbb{R}^d \times \mathbb{R} : 0 \le y \le f(x)\}.$$

Then:

(i) $f$ is measurable on $\mathbb{R}^d$ if and only if $A$ is measurable in $\mathbb{R}^{d+1}$.

(ii) If the conditions in (i) hold, then

$$\int_{\mathbb{R}^d} f(x)\,dx = m(A).$$


Proof. If $f$ is measurable on $\mathbb{R}^d$, then the previous proposition
guarantees that the function

$$F(x,y) = y - f(x)$$

is measurable on $\mathbb{R}^{d+1}$, so we immediately see that $A = \{y \ge 0\} \cap \{F \le 0\}$
is measurable.

Conversely, suppose that $A$ is measurable. We note that for each
$x \in \mathbb{R}^d$ the slice $A_x = \{y \in \mathbb{R} : (x,y) \in A\}$ is a closed segment, namely
$A_x = [0, f(x)]$. Consequently Corollary 3.3 (with the roles of $x$ and $y$
interchanged) yields the measurability of $m(A_x) = f(x)$. Moreover

$$m(A) = \int \chi_A(x,y)\,dx\,dy = \int_{\mathbb{R}^d} m(A_x)\,dx = \int_{\mathbb{R}^d} f(x)\,dx,$$

as was to be shown.

We conclude this section with a useful result. 

Proposition 3.9 If $f$ is a measurable function on $\mathbb{R}^d$, then the function
$\tilde{f}(x,y) = f(x - y)$ is measurable on $\mathbb{R}^d \times \mathbb{R}^d$.

By picking $E = \{z \in \mathbb{R}^d : f(z) < a\}$, we see that it suffices to prove
that whenever $E$ is a measurable subset of $\mathbb{R}^d$, then $\tilde{E} = \{(x,y) : x - y \in E\}$
is a measurable subset of $\mathbb{R}^d \times \mathbb{R}^d$.

Note first that if $O$ is an open set, then $\tilde{O}$ is also open. Taking
countable intersections shows that if $E$ is a $G_\delta$ set, then so is $\tilde{E}$. Assume
now that $m(\tilde{E}_k) = 0$ for each $k$, where $\tilde{E}_k = \tilde{E} \cap \tilde{B}_k$ and $\tilde{B}_k = \{|y| < k\}$.
Again, take $O$ to be open in $\mathbb{R}^d$, and let us calculate $m(\tilde{O} \cap \tilde{B}_k)$. We
have that $\chi_{\tilde{O} \cap \tilde{B}_k} = \chi_O(x - y)\,\chi_{B_k}(y)$. Hence

$$m(\tilde{O} \cap \tilde{B}_k) = \int \chi_O(x - y)\,\chi_{B_k}(y)\,dy\,dx = \int \Big(\int \chi_O(x - y)\,dx\Big)\chi_{B_k}(y)\,dy = m(O)\, m(B_k),$$

by the translation-invariance of the measure. Now if $m(E) = 0$, there is
a sequence of open sets $O_n$ such that $E \subset O_n$ and $m(O_n) \to 0$. It follows
from the above that $\tilde{E}_k \subset \tilde{O}_n \cap \tilde{B}_k$ and $m(\tilde{O}_n \cap \tilde{B}_k) \to 0$ in $n$ for each
fixed $k$. This shows $m(\tilde{E}_k) = 0$, and hence $m(\tilde{E}) = 0$. The proof of the
proposition is concluded once we recall that any measurable set $E$ can
be written as the difference of a $G_\delta$ and a set of measure zero.



4* A Fourier inversion formula 

The question of the inversion of the Fourier transform encompasses in
effect the problem at the origin of Fourier analysis. This issue involves
establishing the validity of the inversion formula for a function $f$ in terms
of its Fourier transform $\hat{f}$, that is,

$$\hat{f}(\xi) = \int_{\mathbb{R}^d} f(x)\,e^{-2\pi i x\cdot\xi}\,dx, \tag{14}$$

$$f(x) = \int_{\mathbb{R}^d} \hat{f}(\xi)\,e^{2\pi i x\cdot\xi}\,d\xi. \tag{15}$$


We have already encountered this problem in Book I in the rudimentary
case when in fact both $f$ and $\hat{f}$ were continuous and had rapid (or
moderate) decrease at infinity. In Book II we also considered the question
in the one-dimensional setting, seen from the viewpoint of complex
analysis. The most elegant and useful formulations of Fourier inversion
are in terms of the $L^2$ theory, or in its greatest generality stated in the
language of distributions. We shall take up these matters systematically
later. It will, nevertheless, be enlightening to digress here to see what
our knowledge at this stage teaches us about this problem. We intend to
do this by presenting a variant of the inversion formula appropriate for
$L^1$, one that is both simple and adequate in many circumstances.

To begin with, we need to have an idea of what can be said about the
Fourier transform of an arbitrary function in $L^1(\mathbb{R}^d)$.

Proposition 4.1 Suppose $f \in L^1(\mathbb{R}^d)$. Then $\hat{f}$ defined by (14) is
continuous and bounded on $\mathbb{R}^d$.

In fact, since $|f(x)e^{-2\pi i x\cdot\xi}| = |f(x)|$, the integral representing $\hat{f}$
converges for each $\xi$ and $\sup_{\xi \in \mathbb{R}^d} |\hat{f}(\xi)| \le \int_{\mathbb{R}^d} |f(x)|\,dx = \|f\|$. To verify the
continuity, note that for every $x$, $f(x)e^{-2\pi i x\cdot\xi} \to f(x)e^{-2\pi i x\cdot\xi_0}$ as $\xi \to \xi_0$,
where $\xi_0$ is any point in $\mathbb{R}^d$; hence $\hat{f}(\xi) \to \hat{f}(\xi_0)$ by the dominated
convergence theorem.

One can assert a little more than the boundedness of $\hat{f}$; namely, one
has $\hat{f}(\xi) \to 0$ as $|\xi| \to \infty$, but not much more can be said about the
decrease at infinity of $\hat{f}$. (See Exercises 22 and 25.) As a consequence,
for general $f \in L^1(\mathbb{R}^d)$ the function $\hat{f}$ is not in $L^1(\mathbb{R}^d)$, and the presumed
formula (15) becomes problematical. The following theorem evades this
difficulty and yet is useful in a number of situations.

Theorem 4.2 Suppose $f \in L^1(\mathbb{R}^d)$ and assume also that $\hat{f} \in L^1(\mathbb{R}^d)$.
Then the inversion formula (15) holds for almost every $x$.

An immediate corollary is the uniqueness of the Fourier transform
on $L^1$.


Footnote: The $L^2$ theory will be dealt with in Chapter 5, and distributions will be studied in
Book IV.

Corollary 4.3 Suppose $\hat{f}(\xi) = 0$ for all $\xi$. Then $f = 0$ a.e.

The proof of the theorem requires only that we adapt the earlier argu- 
ments carried out for Schwartz functions in Chapter 5 of Book I to the 
present context. We begin with the “multiplication formula.” 

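Lemma 4.4 Suppose $f$ and $g$ belong to $L^1(\mathbb{R}^d)$. Then

$$\int_{\mathbb{R}^d} \hat{f}(\xi)\,g(\xi)\,d\xi = \int_{\mathbb{R}^d} f(y)\,\hat{g}(y)\,dy.$$

Note that both integrals converge in view of the proposition above.
Consider the function $F(\xi,y) = g(\xi)f(y)e^{-2\pi i \xi\cdot y}$ defined for $(\xi,y) \in \mathbb{R}^d \times \mathbb{R}^d$.
It is measurable as a function on $\mathbb{R}^{2d}$ in view of Corollary 3.7.
We now apply Fubini's theorem to observe first that

$$\int_{\mathbb{R}^d}\int_{\mathbb{R}^d} |F(\xi,y)|\,d\xi\,dy = \int_{\mathbb{R}^d} |g(\xi)|\,d\xi \int_{\mathbb{R}^d} |f(y)|\,dy < \infty.$$

Next, if we evaluate $\int_{\mathbb{R}^d}\int_{\mathbb{R}^d} F(\xi,y)\,d\xi\,dy$ by writing it as
$\int_{\mathbb{R}^d}\big(\int_{\mathbb{R}^d} F(\xi,y)\,dy\big)\,d\xi$, the inner integral equals $g(\xi)\hat{f}(\xi)$ and
we get the left-hand side of the desired equality. Evaluating the double
integral in the reverse order gives the right-hand side, proving the
lemma.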

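Next we consider the modulated Gaussian, $g(\xi) = e^{-\pi\delta|\xi|^2} e^{2\pi i x\cdot\xi}$, where
for the moment $\delta$ and $x$ are fixed, with $\delta > 0$ and $x \in \mathbb{R}^d$. An elementary
calculation gives

$$\hat{g}(y) = \int_{\mathbb{R}^d} e^{-\pi\delta|\xi|^2} e^{2\pi i (x-y)\cdot\xi}\,d\xi = \delta^{-d/2} e^{-\pi|x-y|^2/\delta},$$

which we will abbreviate as $K_\delta(x - y)$. We recognize $K_\delta$ as a "good
kernel" that satisfies:

(i) $\int_{\mathbb{R}^d} K_\delta(y)\,dy = 1$.

(ii) For each $\eta > 0$, $\int_{|y| > \eta} K_\delta(y)\,dy \to 0$ as $\delta \to 0$.

Applying the lemma gives

$$\int_{\mathbb{R}^d} \hat{f}(\xi)\,e^{-\pi\delta|\xi|^2} e^{2\pi i x\cdot\xi}\,d\xi = \int_{\mathbb{R}^d} f(y)\,K_\delta(x - y)\,dy. \tag{16}$$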


Footnote: For the calculation of $\hat{g}$, see for example Chapter 6 in Book I.

Note that since $\hat{f} \in L^1(\mathbb{R}^d)$, the dominated convergence theorem shows
that the left-hand side of (16) converges to $\int_{\mathbb{R}^d} \hat{f}(\xi)\,e^{2\pi i x\cdot\xi}\,d\xi$ as $\delta \to 0$, for
each $x$. As for the right-hand side, we make two successive changes of
variables $y \to y + x$ (a translation), and $y \to -y$ (a reflection), and take into
account the corresponding invariance of the integrals (see equations (4)
and (5)). Thus the right-hand side becomes $\int_{\mathbb{R}^d} f(x - y)K_\delta(y)\,dy$, and
we will prove that this function converges in the $L^1$-norm to $f$ as $\delta \to 0$.
Indeed, we can write the difference as

$$\Delta_\delta(x) = \int f(x - y)K_\delta(y)\,dy - f(x) = \int \big(f(x - y) - f(x)\big)K_\delta(y)\,dy,$$

because of property (i) above. Thus

$$|\Delta_\delta(x)| \le \int |f(x - y) - f(x)|\,K_\delta(y)\,dy.$$

We can now apply Fubini's theorem, recalling that the measurability
of $f(x)$ and $f(x - y)$ on $\mathbb{R}^d \times \mathbb{R}^d$ is established in Corollary 3.7 and
Proposition 3.9. The result is

$$\|\Delta_\delta\| \le \int \|f_y - f\|\,K_\delta(y)\,dy, \quad\text{where } f_y(x) = f(x - y).$$

Now, for given $\epsilon > 0$ we can find (by Proposition 2.5) $\eta > 0$ so small
that $\|f_y - f\| < \epsilon$ when $|y| < \eta$. Thus

$$\|\Delta_\delta\| \le \epsilon + \int_{|y| > \eta} \|f_y - f\|\,K_\delta(y)\,dy \le \epsilon + 2\|f\| \int_{|y| > \eta} K_\delta(y)\,dy.$$

The first inequality follows by using (i) again; the second holds because
$\|f_y - f\| \le \|f_y\| + \|f\| = 2\|f\|$. Therefore, with the use of (ii), the
combination above is $< 2\epsilon$ if $\delta$ is sufficiently small. To summarize: the
right-hand side of (16) converges to $f$ in the $L^1$-norm as $\delta \to 0$, and thus
by Corollary 2.3 there is a subsequence that converges to $f(x)$ almost
everywhere, and the theorem is proved.
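
The two kernel properties (i) and (ii) used in this proof can be checked numerically. The following is a minimal Python sketch for dimension $d = 1$ only; the helper names `gaussian_kernel`, `total_mass` and `tail_mass`, as well as the quadrature parameters, are illustrative assumptions and not part of the text. The tail mass is written in closed form via the complementary error function, which follows from the substitution $u = y\sqrt{\pi/\delta}$.

```python
import math


def gaussian_kernel(y: float, delta: float) -> float:
    """K_delta(y) = delta^(-1/2) * exp(-pi * y^2 / delta), the case d = 1."""
    return delta ** -0.5 * math.exp(-math.pi * y * y / delta)


def total_mass(delta: float, half_width: float = 10.0, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of K_delta over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    return h * sum(
        gaussian_kernel(-half_width + (i + 0.5) * h, delta) for i in range(steps)
    )


def tail_mass(delta: float, eta: float) -> float:
    """Mass of K_delta on {|y| > eta}; the substitution u = y*sqrt(pi/delta) gives erfc."""
    return math.erfc(eta * math.sqrt(math.pi / delta))


if __name__ == "__main__":
    for delta in (1.0, 0.1, 0.01):
        # Property (i): the total mass stays 1; property (ii): the tail shrinks with delta.
        print(delta, total_mass(delta), tail_mass(delta, eta=0.5))
```

In this sketch the total mass stays at 1 for every $\delta$, while the mass outside $|y| > \eta$ decreases rapidly as $\delta \to 0$, which is exactly what the estimate of $\|\Delta_\delta\|$ relies on.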

Note that an immediate consequence of the theorem and the proposition
is that if $\hat{f}$ were in $L^1$, then $f$ could be modified on a set of measure
zero to become continuous everywhere. This is of course impossible for
the general $f \in L^1(\mathbb{R}^d)$.

5 Exercises 

1. Given a collection of sets $F_1, F_2, \ldots, F_n$, construct another collection $F_1^*, F_2^*, \ldots, F_N^*$,
with $N = 2^n - 1$, so that $\bigcup_{k=1}^n F_k = \bigcup_{j=1}^N F_j^*$; the collection $\{F_j^*\}$ is disjoint; also
$F_k = \bigcup_{F_j^* \subset F_k} F_j^*$, for every $k$.

[Hint: Consider the $2^n$ sets $F_1' \cap F_2' \cap \cdots \cap F_n'$ where each $F_k'$ is either $F_k$ or $F_k^c$.]

2. In analogy to Proposition 2.5, prove that if $f$ is integrable on $\mathbb{R}^d$ and $\delta > 0$,
then $f(\delta x)$ converges to $f(x)$ in the $L^1$-norm as $\delta \to 1$.

3. Suppose $f$ is integrable on $(-\pi, \pi]$ and extended to $\mathbb{R}$ by making it periodic of
period $2\pi$. Show that

$$\int_{-\pi}^{\pi} f(x)\,dx = \int_I f(x)\,dx,$$

where $I$ is any interval in $\mathbb{R}$ of length $2\pi$.

[Hint: $I$ is contained in two consecutive intervals of the form $(k\pi, (k+2)\pi)$.]


4. Suppose $f$ is integrable on $[0, b]$, and



for 0 < X < b. 


Prove that g is integrable on [0, b] and 




dt. 


5. Suppose $F$ is a closed set in $\mathbb{R}$, whose complement has finite measure, and let
$\delta(x)$ denote the distance from $x$ to $F$, that is,

$$\delta(x) = d(x, F) = \inf\{|x - y| : y \in F\}.$$

Consider

$$I(x) = \int_{\mathbb{R}} \frac{\delta(y)}{|x - y|^2}\,dy.$$

(a) Prove that $\delta$ is continuous, by showing that it satisfies the Lipschitz
condition

$$|\delta(x) - \delta(y)| \le |x - y|.$$

(b) Show that $I(x) = \infty$ for each $x \notin F$.

(c) Show that $I(x) < \infty$ for a.e. $x \in F$. This may be surprising in view of the
fact that the Lipschitz condition cancels only one power of $|x - y|$ in the
integrand of $I$.

[Hint: For the last part, investigate $\int_F I(x)\,dx$.]

6. Integrability of $f$ on $\mathbb{R}$ does not necessarily imply the convergence of $f(x)$ to 0
as $x \to \infty$.

(a) There exists a positive continuous function $f$ on $\mathbb{R}$ so that $f$ is integrable
on $\mathbb{R}$, but yet $\limsup_{x \to \infty} f(x) = \infty$.

(b) However, if we assume that $f$ is uniformly continuous on $\mathbb{R}$ and integrable,
then $\lim_{|x| \to \infty} f(x) = 0$.

[Hint: For (a), construct a continuous version of the function equal to $n$ on the
segment $[n, n + 1/n^3)$, $n \ge 1$.]

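7. Let $\Gamma \subset \mathbb{R}^d \times \mathbb{R}$, $\Gamma = \{(x,y) \in \mathbb{R}^d \times \mathbb{R} : y = f(x)\}$, and assume $f$ is measurable
on $\mathbb{R}^d$. Show that $\Gamma$ is a measurable subset of $\mathbb{R}^{d+1}$, and $m(\Gamma) = 0$.

8. If $f$ is integrable on $\mathbb{R}$, show that $F(x) = \int_{-\infty}^x f(t)\,dt$ is uniformly continuous.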

9. Tchebychev inequality. Suppose $f \ge 0$, and $f$ is integrable. If $\alpha > 0$ and
$E_\alpha = \{x : f(x) > \alpha\}$, prove that

$$m(E_\alpha) \le \frac{1}{\alpha} \int f.$$



10. Suppose $f \ge 0$, and let $E_{2^k} = \{x : f(x) > 2^k\}$ and $F_k = \{x : 2^k < f(x) \le 2^{k+1}\}$.
If $f$ is finite almost everywhere, then

$$\bigcup_{k=-\infty}^{\infty} F_k = \{f(x) > 0\},$$

and the sets $F_k$ are disjoint.

Prove that $f$ is integrable if and only if

$$\sum_{k=-\infty}^{\infty} 2^k m(F_k) < \infty, \quad\text{if and only if}\quad \sum_{k=-\infty}^{\infty} 2^k m(E_{2^k}) < \infty.$$



Use this result to verify the following assertions. Let 



and 



Then $f$ is integrable on $\mathbb{R}^d$ if and only if $a < d$; also $g$ is integrable on $\mathbb{R}^d$ if and
only if $b > d$.


11. Prove that if $f$ is integrable on $\mathbb{R}^d$, real-valued, and $\int_E f(x)\,dx \ge 0$ for every
measurable $E$, then $f(x) \ge 0$ a.e. $x$. As a result, if $\int_E f(x)\,dx = 0$ for every
measurable $E$, then $f(x) = 0$ a.e.

12. Show that there are $f \in L^1(\mathbb{R}^d)$ and a sequence $\{f_n\}$ with $f_n \in L^1(\mathbb{R}^d)$ such
that

$$\|f - f_n\|_{L^1} \to 0,$$

but $f_n(x) \to f(x)$ for no $x$.

[Hint: In $\mathbb{R}$, let $f_n = \chi_{I_n}$, where $I_n$ is an appropriately chosen sequence of intervals
with $m(I_n) \to 0$.]

13. Give an example of two measurable sets $A$ and $B$ such that $A + B$ is not
measurable.

[Hint: In $\mathbb{R}^2$ take $A = \{0\} \times [0, 1]$ and $B = \mathcal{N} \times \{0\}$.]

14. In Exercise 6 of the previous chapter we saw that $m(B) = v_d r^d$, whenever $B$
is a ball of radius $r$ in $\mathbb{R}^d$ and $v_d = m(B_1)$, with $B_1$ the unit ball. Here we evaluate
the constant $v_d$.

(a) For $d = 2$, prove using Corollary 3.8 that

$$v_2 = 2\int_{-1}^{1} (1 - x^2)^{1/2}\,dx,$$

and hence by elementary calculus, that $v_2 = \pi$.

(b) By similar methods, show that

$$v_d = 2v_{d-1} \int_0^1 (1 - x^2)^{(d-1)/2}\,dx.$$

(c) The result is

$$v_d = \frac{\pi^{d/2}}{\Gamma(d/2 + 1)}.$$

Another derivation is in Exercise 5 in Chapter 6 below. Relevant facts about the
gamma and beta functions can be found in Chapter 6 of Book II.


15. Consider the function defined over $\mathbb{R}$ by

$$f(x) = \begin{cases} x^{-1/2} & \text{if } 0 < x < 1, \\ 0 & \text{otherwise.} \end{cases}$$

For a fixed enumeration $\{r_n\}_{n=1}^\infty$ of the rationals $\mathbb{Q}$, let

$$F(x) = \sum_{n=1}^{\infty} 2^{-n} f(x - r_n).$$

Prove that $F$ is integrable, hence the series defining $F$ converges for almost every
$x \in \mathbb{R}$. However, observe that this series is unbounded on every interval, and in
fact, any function $\tilde{F}$ that agrees with $F$ a.e. is unbounded in any interval.

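16. Suppose $f$ is integrable on $\mathbb{R}^d$. If $\delta = (\delta_1, \ldots, \delta_d)$ is a $d$-tuple of non-zero real
numbers, and

$$f^\delta(x) = f(\delta x) = f(\delta_1 x_1, \ldots, \delta_d x_d),$$

show that $f^\delta$ is integrable with

$$\int_{\mathbb{R}^d} f^\delta(x)\,dx = |\delta_1|^{-1} \cdots |\delta_d|^{-1} \int_{\mathbb{R}^d} f(x)\,dx.$$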



17. Suppose $f$ is defined on $\mathbb{R}^2$ as follows: $f(x,y) = a_n$ if $n \le x < n+1$ and
$n \le y < n+1$, $(n \ge 0)$; $f(x,y) = -a_n$ if $n \le x < n+1$ and $n+1 \le y < n+2$, $(n \ge 0)$;
while $f(x,y) = 0$ elsewhere. Here $a_n = \sum_{k \le n} b_k$, with $\{b_k\}$ a positive sequence
such that $\sum_{k=0}^\infty b_k = s < \infty$.

(a) Verify that each slice $f^y$ and $f_x$ is integrable. Also for all $x$, $\int f_x(y)\,dy = 0$,
and hence $\int \big(\int f(x,y)\,dy\big)\,dx = 0$.

(b) However, $\int f^y(x)\,dx = a_0$ if $0 \le y < 1$, and $\int f^y(x)\,dx = a_n - a_{n-1}$ if
$n \le y < n+1$ with $n \ge 1$. Hence $y \mapsto \int f^y(x)\,dx$ is integrable on $(0, \infty)$ and

$$\int \Big(\int f(x,y)\,dx\Big)\,dy = s.$$

(c) Note that $\int_{\mathbb{R} \times \mathbb{R}} |f(x,y)|\,dx\,dy = \infty$.


18. Let $f$ be a measurable finite-valued function on $[0, 1]$, and suppose that $|f(x) - f(y)|$
is integrable on $[0, 1] \times [0, 1]$. Show that $f(x)$ is integrable on $[0, 1]$.

19. Suppose $f$ is integrable on $\mathbb{R}^d$. For each $\alpha > 0$, let $E_\alpha = \{x : |f(x)| > \alpha\}$.
Prove that



20. The problem (highlighted in the discussion preceding Fubini's theorem) that
certain slices of measurable sets can be non-measurable may be avoided by
restricting attention to Borel measurable functions and Borel sets. In fact, prove the
following:

Suppose $E$ is a Borel set in $\mathbb{R}^2$. Then for every $y$, the slice $E^y$ is a Borel set in $\mathbb{R}$.

[Hint: Consider the collection $\mathcal{C}$ of subsets $E$ of $\mathbb{R}^2$ with the property that each
slice $E^y$ is a Borel set in $\mathbb{R}$. Verify that $\mathcal{C}$ is a $\sigma$-algebra that contains the open
sets.]

21. Suppose that $f$ and $g$ are measurable functions on $\mathbb{R}^d$.

(a) Prove that $f(x - y)g(y)$ is measurable on $\mathbb{R}^{2d}$.

(b) Show that if $f$ and $g$ are integrable on $\mathbb{R}^d$, then $f(x - y)g(y)$ is integrable
on $\mathbb{R}^{2d}$.

(c) Recall the definition of the convolution of $f$ and $g$ given by

$$(f * g)(x) = \int_{\mathbb{R}^d} f(x - y)g(y)\,dy.$$

Show that $f * g$ is well defined for a.e. $x$ (that is, $f(x - y)g(y)$ is integrable
on $\mathbb{R}^d$ for a.e. $x$).

(d) Show that $f * g$ is integrable whenever $f$ and $g$ are integrable, and that

$$\|f * g\|_{L^1(\mathbb{R}^d)} \le \|f\|_{L^1(\mathbb{R}^d)}\, \|g\|_{L^1(\mathbb{R}^d)},$$

with equality if $f$ and $g$ are non-negative.

(e) The Fourier transform of an integrable function $f$ is defined by

$$\hat{f}(\xi) = \int_{\mathbb{R}^d} f(x)\,e^{-2\pi i x\cdot\xi}\,dx.$$

Check that $\hat{f}$ is bounded and is a continuous function of $\xi$. Prove that for
each $\xi$ one has

$$\widehat{(f * g)}(\xi) = \hat{f}(\xi)\,\hat{g}(\xi).$$


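22. Prove that if $f \in L^1(\mathbb{R}^d)$ and

$$\hat{f}(\xi) = \int_{\mathbb{R}^d} f(x)\,e^{-2\pi i x\cdot\xi}\,dx,$$

then $\hat{f}(\xi) \to 0$ as $|\xi| \to \infty$. (This is the Riemann-Lebesgue lemma.)

[Hint: Write $\hat{f}(\xi) = \frac{1}{2} \int_{\mathbb{R}^d} [f(x) - f(x - \xi')]\,e^{-2\pi i x\cdot\xi}\,dx$, where $\xi' = \frac{1}{2}\frac{\xi}{|\xi|^2}$, and use
Proposition 2.5.]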

23. As an application of the Fourier transform, show that there does not exist a
function $I \in L^1(\mathbb{R}^d)$ such that

$$f * I = f \quad\text{for all } f \in L^1(\mathbb{R}^d).$$

24. Consider the convolution

$$(f * g)(x) = \int_{\mathbb{R}^d} f(x - y)g(y)\,dy.$$

(a) Show that $f * g$ is uniformly continuous when $f$ is integrable and $g$ bounded.

(b) If in addition $g$ is integrable, prove that $(f * g)(x) \to 0$ as $|x| \to \infty$.

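25. Show that for each $\epsilon > 0$ the function $F(\xi) = \frac{1}{(1 + |\xi|^2)^\epsilon}$ is the Fourier transform
of an $L^1$ function.

[Hint: With $K_\delta(x) = e^{-\pi|x|^2/\delta}\,\delta^{-d/2}$ consider $f(x) = \int_0^\infty K_\delta(x)\,e^{-\pi\delta}\,\delta^{\epsilon - 1}\,d\delta$. Use
Fubini's theorem to prove $f \in L^1(\mathbb{R}^d)$, and

$$\hat{f}(\xi) = \int_0^\infty e^{-\pi\delta|\xi|^2}\,e^{-\pi\delta}\,\delta^{\epsilon - 1}\,d\delta,$$

and evaluate the last integral as $\pi^{-\epsilon}\,\Gamma(\epsilon)\,\frac{1}{(1 + |\xi|^2)^\epsilon}$. Here $\Gamma(s)$ is the gamma function
defined by $\Gamma(s) = \int_0^\infty e^{-t}\,t^{s-1}\,dt$.]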


6 Problems 

1. If $f$ is integrable on $[0, 2\pi]$, then $\int_0^{2\pi} f(x)\,e^{-inx}\,dx \to 0$ as $|n| \to \infty$.

Show as a consequence that if $E$ is a measurable subset of $[0, 2\pi]$, then



for any sequence $\{u_n\}$.
[Hint: See Exercise 22.]


2. Prove the Cantor-Lebesgue theorem: if

$$\sum_{n=0}^{\infty} A_n(x) = \sum_{n=0}^{\infty} (a_n \cos nx + b_n \sin nx)$$

converges for $x$ in a set of positive measure (or in particular for all $x$), then $a_n \to 0$
and $b_n \to 0$ as $n \to \infty$.

[Hint: Note that $A_n(x) \to 0$ uniformly on a set $E$ of positive measure.]

3. A sequence $\{f_k\}$ of measurable functions on $\mathbb{R}^d$ is Cauchy in measure if for
every $\epsilon > 0$,

$$m(\{x : |f_k(x) - f_\ell(x)| > \epsilon\}) \to 0 \quad\text{as } k, \ell \to \infty.$$

We say that $\{f_k\}$ converges in measure to a (measurable) function $f$ if for every
$\epsilon > 0$

$$m(\{x : |f_k(x) - f(x)| > \epsilon\}) \to 0 \quad\text{as } k \to \infty.$$

This notion coincides with the "convergence in probability" of probability theory.

Prove that if a sequence $\{f_k\}$ of integrable functions converges to $f$ in $L^1$, then
$\{f_k\}$ converges to $f$ in measure. Is the converse true?

We remark that this mode of convergence appears naturally in the proof of 
Egorov’s theorem. 


4. We have already seen (in Exercise 8, Chapter 1) that if $E$ is a measurable set
in $\mathbb{R}^d$, and $L$ is a linear transformation of $\mathbb{R}^d$ to $\mathbb{R}^d$, then $L(E)$ is also measurable,
and if $E$ has measure 0, then so has $L(E)$. The quantitative statement is

$$m(L(E)) = |\det(L)|\, m(E).$$

As a special case, note that the Lebesgue measure is invariant under rotations.
(For this special case see also Exercise 26 in the next chapter.)

The above identity can be proved using Fubini's theorem as follows.

(a) Consider first the case $d = 2$, and $L$ a "strictly" upper triangular
transformation $x' = x + ay$, $y' = y$. Then

$$\chi_{L(E)}(x,y) = \chi_E(L^{-1}(x,y)) = \chi_E(x - ay, y).$$

Hence

$$m(L(E)) = \int \Big(\int \chi_E(x - ay, y)\,dx\Big)\,dy = \int \Big(\int \chi_E(x, y)\,dx\Big)\,dy = m(E),$$

by the translation-invariance of the measure.

(b) Similarly $m(L(E)) = m(E)$ if $L$ is strictly lower triangular. In general, one
can write $L = L_1 \Delta L_2$, where $L_j$ are strictly (upper and lower) triangular
and $\Delta$ is diagonal. Thus $m(L(E)) = |\det(L)|\, m(E)$, if one uses Exercise 7
in Chapter 1.


5. There is an ordering $\prec$ of $\mathbb{R}$ with the property that for each $y \in \mathbb{R}$ the set
$\{x \in \mathbb{R} : x \prec y\}$ is at most countable.

The existence of this ordering depends on the continuum hypothesis, which
asserts: whenever $S$ is an infinite subset of $\mathbb{R}$, then either $S$ is countable, or $S$ has
the cardinality of $\mathbb{R}$ (that is, can be mapped bijectively to $\mathbb{R}$). (See the footnote below.)

Footnote: This assertion, formulated by Cantor, is like the well-ordering principle independent
of the other axioms of set theory, and so we are also free to accept its validity.

[Hint: Let $\prec$ denote a well-ordering of $\mathbb{R}$, and define the set $X$ by $X = \{y \in \mathbb{R} :$
the set $\{x : x \prec y\}$ is not countable$\}$. If $X$ is empty we are done. Otherwise,
consider the smallest element $\bar{y}$ in $X$, and use the continuum hypothesis.]



Differentiation and Integration


The Maximal Problem: 


The problem is most easily grasped when stated 
in the language of cricket, or any other game in which 
a player compiles a series of scores of which an average 
is recorded. 


G. H. Hardy and J. E. Littlewood, 1930 


That differentiation and integration are inverse operations was already 
understood early in the study of the calculus. Here we want to reexamine 
this basic idea in the framework of the general theory studied in the 
previous chapters. Our objective is the formulation and proof of the 
fundamental theorem of the calculus in this setting, and the development 
of some of the concepts that occur. We shall try to achieve this by 
answering two questions, each expressing one of the ways of representing 
the reciprocity between differentiation and integration. 

The first problem involved may be stated as follows. 

• Suppose $f$ is integrable on $[a, b]$ and $F$ is its indefinite integral
$F(x) = \int_a^x f(y)\,dy$. Does this imply that $F$ is differentiable (at
least for almost every $x$), and that $F' = f$?

We shall see that the affirmative answer to this question depends 
on ideas that have broad application and are not limited to the one- 
dimensional situation. 

For the second question we reverse the order of differentiation and 
integration. 

• What conditions on a function $F$ on $[a, b]$ guarantee that $F'(x)$ exists
(for a.e. $x$), that this function is integrable, and that moreover

$$F(b) - F(a) = \int_a^b F'(x)\,dx\,?$$

While this problem will be examined from a narrower perspective than
the first, the issues it raises are deep and the consequences entailed are
far-reaching. In particular, we shall find that this question is connected
to the problem of rectifiability of curves, and as an illustration of this
link, we shall establish the general isoperimetric inequality in the plane.

1 Differentiation of the integral 

We begin with the first problem, that is, the study of differentiation of
the integral. If $f$ is given on $[a, b]$ and integrable on that interval, we let

$$F(x) = \int_a^x f(y)\,dy, \quad a \le x \le b.$$

To deal with $F'(x)$, we recall the definition of the derivative as the limit
of the quotient

$$\frac{F(x + h) - F(x)}{h}$$

when $h$ tends to 0. We note that this quotient takes the form (say in the case $h > 0$)

$$\frac{1}{h} \int_x^{x+h} f(y)\,dy = \frac{1}{|I|} \int_I f(y)\,dy,$$

where we use the notation $I = (x, x + h)$ and $|I|$ for the length of this
interval. At this point, we pause to observe that the above expression
is the "average" value of $f$ over $I$, and that in the limit as $|I| \to 0$,
we might expect that these averages tend to $f(x)$. Reformulating the
question slightly, we may ask whether

$$\lim_{\substack{|I| \to 0 \\ x \in I}} \frac{1}{|I|} \int_I f(y)\,dy = f(x)$$


holds for suitable points $x$. In higher dimensions we can pose a similar
question, where the averages of $f$ are taken over appropriate sets that
generalize the intervals in one dimension. Initially we shall study this
problem where the sets involved are the balls $B$ containing $x$, with their
volume $m(B)$ replacing the length $|I|$ of $I$. Later we shall see that as a
consequence of this special case similar results will hold for more general
collections of sets, those that have bounded "eccentricity."

With this in mind we restate our first problem in the context of $\mathbb{R}^d$,
for all $d \ge 1$.

Suppose $f$ is integrable on $\mathbb{R}^d$. Is it true that

$$\lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B f(y)\,dy = f(x) \quad\text{for a.e. } x?$$

The limit is taken as the volume of open balls $B$ containing
$x$ goes to 0.

We shall refer to this question as the averaging problem. We remark
that if $B$ is any ball of radius $r$ in $\mathbb{R}^d$, then $m(B) = v_d r^d$, where $v_d$ is
the measure of the unit ball. (See Exercise 14 in the previous chapter.)

Note of course that in the special case when $f$ is continuous at $x$, the
limit does converge to $f(x)$. Indeed, given $\epsilon > 0$, there exists $\delta > 0$ such
that $|f(x) - f(y)| < \epsilon$ whenever $|x - y| < \delta$. Since

$$f(x) - \frac{1}{m(B)} \int_B f(y)\,dy = \frac{1}{m(B)} \int_B (f(x) - f(y))\,dy,$$

we find that whenever $B$ is a ball of radius $< \delta/2$ that contains $x$, then

$$\Big| f(x) - \frac{1}{m(B)} \int_B f(y)\,dy \Big| \le \frac{1}{m(B)} \int_B |f(x) - f(y)|\,dy < \epsilon,$$

as desired.

The averaging problem has an affirmative answer, but to establish that
fact, which is qualitative in nature, we need to make some quantitative
estimates bearing on the overall behavior of the averages of $f$. This will
be done in terms of the maximal averages of $|f|$, to which we now turn.


1.1 The Hardy-Littlewood maximal function 

The maximal function that we consider below arose first in the one- 
dimensional situation treated by Hardy and Littlewood. It seems that 
they were led to the study of this function by toying with the question 
of how a batsman’s score in cricket may best be distributed to maximize 
his satisfaction. As it turns out, the concepts involved have a universal 
significance in analysis. The relevant definition is as follows. 

If $f$ is integrable on $\mathbb{R}^d$, we define its maximal function $f^*$ by

$$f^*(x) = \sup_{x \in B} \frac{1}{m(B)} \int_B |f(y)|\,dy, \quad x \in \mathbb{R}^d,$$

where the supremum is taken over all balls containing the point $x$. In
other words, we replace the limit in the statement of the averaging
problem by a supremum, and $f$ by its absolute value.

The main properties of $f^*$ we shall need are summarized in a theorem.

Theorem 1.1 Suppose $f$ is integrable on $\mathbb{R}^d$. Then:

(i) $f^*$ is measurable.

(ii) $f^*(x) < \infty$ for a.e. $x$.

(iii) $f^*$ satisfies

$$m(\{x \in \mathbb{R}^d : f^*(x) > \alpha\}) \le \frac{A}{\alpha}\, \|f\|_{L^1(\mathbb{R}^d)} \tag{1}$$

for all $\alpha > 0$, where $A = 3^d$, and $\|f\|_{L^1(\mathbb{R}^d)} = \int_{\mathbb{R}^d} |f(x)|\,dx$.
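
To get a concrete feel for the maximal function and the weak-type bound (1), here is a small Python sketch of a discrete analogue in dimension one. It is only an illustration under stated assumptions: the function name `discrete_maximal` is hypothetical, $f$ is taken to be a step function on a uniform grid, and "balls containing $x$" are replaced by blocks of consecutive grid cells containing the cell of $x$.

```python
def discrete_maximal(values):
    """Uncentered maximal function of a step function on a uniform grid.

    values[i] is the constant value of f on the i-th cell. For each cell i we
    take the largest average of |f| over any block of consecutive cells
    a..b with a <= i <= b (a discrete stand-in for balls containing x).
    """
    n = len(values)
    # prefix[j] holds the sum of |values[0]|, ..., |values[j-1]|.
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + abs(v))
    result = []
    for i in range(n):
        best = 0.0
        for a in range(i + 1):
            for b in range(i, n):
                avg = (prefix[b + 1] - prefix[a]) / (b + 1 - a)
                best = max(best, avg)
        result.append(best)
    return result


if __name__ == "__main__":
    h = 0.25                                  # cell width; the grid covers [-2, 3)
    f = [0.0] * 8 + [1.0] * 4 + [0.0] * 8     # indicator of [0, 1)
    f_star = discrete_maximal(f)
    alpha = 0.5
    level_set_measure = h * sum(1 for v in f_star if v > alpha)
    l1_norm = h * sum(abs(v) for v in f)
    # Weak-type bound with A = 3^d = 3 in dimension one.
    print(level_set_measure, "<=", 3.0 / alpha * l1_norm)
```

For the indicator of $[0,1)$ the discrete maximal function equals 1 on the support and decays like the reciprocal of the distance outside it, so it fails to be integrable in the limit even though each level set has finite measure controlled by $\|f\|_{L^1}/\alpha$.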

Before we come to the proof we want to clarify the nature of the main
conclusion (iii). As we shall observe, one has that $f^*(x) \ge |f(x)|$ for a.e.
$x$; the effect of (iii) is that, broadly speaking, $f^*$ is not much larger than
$|f|$. From this point of view, we would have liked to conclude that $f^*$ is
integrable, as a result of the assumed integrability of $f$. However, this
is not the case, and (iii) is the best substitute available (see Exercises 4
and 5).

An inequality of the type (1) is called a weak-type inequality because
it is weaker than the corresponding inequality for the $L^1$-norms.
Indeed, this can be seen from the Tchebychev inequality (Exercise 9 in
Chapter 2), which states that for an arbitrary integrable function $g$,

$$m(\{x : |g(x)| > \alpha\}) \le \frac{1}{\alpha}\, \|g\|_{L^1(\mathbb{R}^d)}, \quad\text{for all } \alpha > 0.$$

We should add that the exact value of $A$ in the inequality (1) is
unimportant for us. What matters is that this constant be independent of $\alpha$
and $f$.

The only simple assertion in the theorem is that $f^*$ is a measurable
function. Indeed, the set $E_\alpha = \{x \in \mathbb{R}^d : f^*(x) > \alpha\}$ is open, because if
$\bar{x} \in E_\alpha$, there exists a ball $B$ such that $\bar{x} \in B$ and

$$\frac{1}{m(B)} \int_B |f(y)|\,dy > \alpha.$$

Now any point $x$ close enough to $\bar{x}$ will also belong to $B$; hence $x \in E_\alpha$
as well.

The two other properties of $f^*$ in the theorem are deeper, with (ii)
being a consequence of (iii). This follows at once if we observe that

$$\{x : f^*(x) = \infty\} \subset \{x : f^*(x) > \alpha\}$$

for all $\alpha$. Taking the limit as $\alpha$ tends to infinity, the third property yields
$m(\{x : f^*(x) = \infty\}) = 0$.

The proof of inequality (1) relies on an elementary version of a Vitali
covering argument. (See the footnote following the proof of the lemma.)

Lemma 1.2 Suppose $\mathcal{B} = \{B_1, B_2, \ldots, B_N\}$ is a finite collection of open
balls in $\mathbb{R}^d$. Then there exists a disjoint sub-collection $B_{i_1}, B_{i_2}, \ldots, B_{i_k}$
of $\mathcal{B}$ that satisfies

$$m\Big(\bigcup_{\ell=1}^{N} B_\ell\Big) \le 3^d \sum_{j=1}^{k} m(B_{i_j}).$$

Loosely speaking, we may always find a disjoint sub-collection of balls
that covers a fraction of the region covered by the original collection of
balls.

Proof. The argument we give is constructive and relies on the following
simple observation: Suppose $B$ and $B'$ are a pair of balls that
intersect, with the radius of $B'$ being not greater than that of $B$. Then
$B'$ is contained in the ball $\tilde{B}$ that is concentric with $B$ but with 3 times
its radius.

Figure 1. The balls $B$ and $\tilde{B}$

As a first step, we pick a ball $B_{i_1}$ in $\mathcal{B}$ with maximal (that is, largest)
radius, and then delete from $\mathcal{B}$ the ball $B_{i_1}$ as well as any balls that
intersect $B_{i_1}$. Thus all the balls that are deleted are contained in the
ball $\tilde{B}_{i_1}$ concentric with $B_{i_1}$, but with 3 times its radius.

The remaining balls yield a new collection $\mathcal{B}'$, for which we repeat the
procedure. We pick $B_{i_2}$ with largest radius in $\mathcal{B}'$, and then delete from
$\mathcal{B}'$ the ball $B_{i_2}$ and any ball that intersects $B_{i_2}$. Continuing this way we
find, after at most $N$ steps, a collection of disjoint balls $B_{i_1}, B_{i_2}, \ldots, B_{i_k}$.

Finally, to prove that this disjoint collection of balls satisfies the
inequality in the lemma, we use the observation made at the beginning of
the proof. We let $\tilde{B}_{i_j}$ denote the ball concentric with $B_{i_j}$, but with 3
times its radius. Since any ball $B$ in $\mathcal{B}$ must intersect a ball $B_{i_j}$ and have
equal or smaller radius than $B_{i_j}$, we must have $B \subset \tilde{B}_{i_j}$, thus

$$m\Big(\bigcup_{\ell=1}^{N} B_\ell\Big) \le m\Big(\bigcup_{j=1}^{k} \tilde{B}_{i_j}\Big) \le \sum_{j=1}^{k} m(\tilde{B}_{i_j}) = 3^d \sum_{j=1}^{k} m(B_{i_j}).$$

In the last step we have used the fact that in $\mathbb{R}^d$ a dilation of a set by
$\delta > 0$ results in the multiplication by $\delta^d$ of the Lebesgue measure of this
set.

Footnote: We note that the lemma above is the first of a series of covering arguments that
occur below in the theory of differentiation; see also Lemma 3.9 and its corollary, as well
as Lemma 3.5, where the covering assertion is more implicit.
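
The selection procedure in the proof is a greedy algorithm, and it can be sketched in a few lines of Python. The sketch below treats only the one-dimensional case, where open balls are intervals $(c - r, c + r)$ and the constant is $3^d = 3$; the function names `select_disjoint` and `union_length` and the representation of a ball as a `(center, radius)` pair are assumptions made for illustration.

```python
def select_disjoint(balls):
    """Greedy selection from the proof; balls are (center, radius) pairs in R.

    Repeatedly keep a remaining ball of largest radius and discard every
    remaining ball that meets it. Open intervals (c - r, c + r) intersect
    exactly when the distance between centers is less than the sum of radii.
    """
    remaining = sorted(balls, key=lambda b: b[1], reverse=True)
    chosen = []
    while remaining:
        c0, r0 = remaining[0]
        chosen.append((c0, r0))
        remaining = [(c, r) for (c, r) in remaining[1:] if abs(c - c0) >= r + r0]
    return chosen


def union_length(balls):
    """Lebesgue measure of a finite union of open intervals (c - r, c + r)."""
    total, right_end = 0.0, float("-inf")
    for left, right in sorted((c - r, c + r) for c, r in balls):
        if right > right_end:
            # Only the part of the interval beyond what is already covered counts.
            total += right - max(left, right_end)
            right_end = right
    return total


if __name__ == "__main__":
    balls = [(0.0, 2.0), (1.0, 1.0), (3.0, 1.5), (6.0, 1.0), (6.5, 0.5), (10.0, 0.25)]
    chosen = select_disjoint(balls)
    print(chosen)
    print(union_length(balls), "<=", 3 * sum(2 * r for _, r in chosen))
```

Sorting by radius once is enough here because discarding balls never changes the relative order of those that remain, so the first remaining ball always has the largest radius.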

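The proof of (iii) in Theorem 1.1 is now in reach. If we let $E_\alpha = \{x :
f^*(x) > \alpha\}$, then for each $x \in E_\alpha$ there exists a ball $B_x$ that contains $x$,
and such that

$$\frac{1}{m(B_x)} \int_{B_x} |f(y)|\,dy > \alpha.$$

Therefore, for each ball $B_x$ we have

$$m(B_x) \le \frac{1}{\alpha} \int_{B_x} |f(y)|\,dy. \tag{2}$$

Fix a compact subset $K$ of $E_\alpha$. Since $K$ is covered by $\bigcup_{x \in E_\alpha} B_x$, we
may select a finite subcover of $K$, say $K \subset \bigcup_{\ell=1}^{N} B_\ell$. The covering lemma
guarantees the existence of a sub-collection $B_{i_1}, \ldots, B_{i_k}$ of disjoint balls
with

$$m\Big(\bigcup_{\ell=1}^{N} B_\ell\Big) \le 3^d \sum_{j=1}^{k} m(B_{i_j}). \tag{3}$$

Since the balls $B_{i_1}, \ldots, B_{i_k}$ are disjoint and satisfy (2) as well as (3), we
find that

$$m(K) \le m\Big(\bigcup_{\ell=1}^{N} B_\ell\Big) \le 3^d \sum_{j=1}^{k} m(B_{i_j}) \le \frac{3^d}{\alpha} \sum_{j=1}^{k} \int_{B_{i_j}} |f(y)|\,dy = \frac{3^d}{\alpha} \int_{\bigcup_{j=1}^{k} B_{i_j}} |f(y)|\,dy \le \frac{3^d}{\alpha} \int_{\mathbb{R}^d} |f(y)|\,dy.$$

Since this inequality is true for all compact subsets $K$ of $E_\alpha$, the proof
of the weak type inequality for the maximal operator is complete.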

1.2 The Lebesgue differentiation theorem 

The estimate obtained for the maximal function now leads to a solution 
of the averaging problem. 

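Theorem 1.3 If $f$ is integrable on $\mathbb{R}^d$, then

$$\lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B f(y)\,dy = f(x) \quad\text{for a.e. } x. \tag{4}$$

Proof. It suffices to show that for each $\alpha > 0$ the set

$$E_\alpha = \Big\{ x : \limsup_{\substack{m(B) \to 0 \\ x \in B}} \Big| \frac{1}{m(B)} \int_B f(y)\,dy - f(x) \Big| > 2\alpha \Big\}$$

has measure zero, because this assertion then guarantees that the set
$E = \bigcup_{n=1}^{\infty} E_{1/n}$ has measure zero, and the limit in (4) holds at all points
of $E^c$.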


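We fix $\alpha$, and recall Theorem 2.4 in Chapter 2, which states that for
each $\epsilon > 0$ we may select a continuous function $g$ of compact support with
$\|f - g\|_{L^1(\mathbb{R}^d)} < \epsilon$. As we remarked earlier, the continuity of $g$ implies
that

$$\lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B g(y)\,dy = g(x), \quad\text{for all } x.$$

Since we may write the difference $\frac{1}{m(B)} \int_B f(y)\,dy - f(x)$ as

$$\frac{1}{m(B)} \int_B (f(y) - g(y))\,dy + \frac{1}{m(B)} \int_B g(y)\,dy - g(x) + g(x) - f(x),$$

we find that

$$\limsup_{\substack{m(B) \to 0 \\ x \in B}} \Big| \frac{1}{m(B)} \int_B f(y)\,dy - f(x) \Big| \le (f - g)^*(x) + |g(x) - f(x)|,$$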

where the symbol $*$ indicates the maximal function. Consequently, if

$$F_\alpha = \{x : (f - g)^*(x) > \alpha\} \quad\text{and}\quad G_\alpha = \{x : |f(x) - g(x)| > \alpha\},$$

then $E_\alpha \subset (F_\alpha \cup G_\alpha)$, because if $u_1$ and $u_2$ are positive, then $u_1 + u_2 > 2\alpha$
only if $u_i > \alpha$ for at least one $u_i$. On the one hand, Tchebychev's
inequality yields

$$m(G_\alpha) \le \frac{1}{\alpha}\, \|f - g\|_{L^1(\mathbb{R}^d)},$$

and on the other hand, the weak type estimate for the maximal function
gives

$$m(F_\alpha) \le \frac{A}{\alpha}\, \|f - g\|_{L^1(\mathbb{R}^d)}.$$

The function $g$ was selected so that $\|f - g\|_{L^1(\mathbb{R}^d)} < \epsilon$. Hence we get

$$m(E_\alpha) \le \frac{A}{\alpha}\,\epsilon + \frac{1}{\alpha}\,\epsilon.$$

Since $\epsilon$ is arbitrary, we must have $m(E_\alpha) = 0$, and the proof of the
theorem is complete.

Note that as an immediate consequence of the theorem applied to $|f|$,
we see that $f^*(x) \ge |f(x)|$ for a.e. $x$, with $f^*$ the maximal function.

We have worked so far under the assumption that $f$ is integrable. This
"global" assumption is slightly out of place in the context of a "local"
notion like differentiability. Indeed, the limit in Lebesgue's theorem is
taken over balls that shrink to the point $x$, so the behavior of $f$ far from
$x$ is irrelevant. Thus, we expect the result to remain valid if we simply
assume integrability of $f$ on every ball.

To make this precise, we say that a measurable function $f$ on $\mathbb{R}^d$
is locally integrable, if for every ball $B$ the function $f(x)\chi_B(x)$ is
integrable. We shall denote by $L^1_{\mathrm{loc}}(\mathbb{R}^d)$ the space of all locally integrable
functions. Loosely speaking, the behavior at infinity does not affect the
local integrability of a function. For example, the functions $e^{|x|}$ and
$|x|^{-1/2}$ are both locally integrable, but not integrable on $\mathbb{R}^d$.

Clearly, the conclusion of the last theorem holds under the weaker
assumption that $f$ is locally integrable.

Theorem 1.4 If $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, then

$$\lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B f(y)\,dy = f(x), \quad\text{for a.e. } x.$$


Our first application of this theorem yields an interesting insight into
the nature of measurable sets. If $E$ is a measurable set and $x \in \mathbb{R}^d$, we
say that $x$ is a point of Lebesgue density of $E$ if

$$\lim_{\substack{m(B)\to 0 \\ x\in B}} \frac{m(B \cap E)}{m(B)} = 1.$$

Loosely speaking, this condition says that small balls around $x$ are almost
entirely covered by $E$. More precisely, for every $\alpha < 1$ close to 1, and
every ball of sufficiently small radius containing $x$, we have

$$m(B \cap E) \ge \alpha\, m(B).$$

Thus $E$ covers at least a proportion $\alpha$ of $B$.

An application of Theorem 1.4 to the characteristic function of $E$
immediately yields the following:

Corollary 1.5 Suppose $E$ is a measurable subset of $\mathbb{R}^d$. Then:

(i) Almost every $x \in E$ is a point of density of $E$.

(ii) Almost every $x \notin E$ is not a point of density of $E$.

We next consider a notion that for integrable functions serves as a useful 
substitute for pointwise continuity. 

If $f$ is locally integrable on $\mathbb{R}^d$, the Lebesgue set of $f$ consists of all
points $x \in \mathbb{R}^d$ for which $f(x)$ is finite and

$$\lim_{\substack{m(B)\to 0 \\ x\in B}} \frac{1}{m(B)}\int_B |f(y) - f(x)|\,dy = 0.$$

At this stage, two simple observations about this definition are in order.
First, $x$ belongs to the Lebesgue set of $f$ whenever $f$ is continuous at $x$.
Second, if $x$ is in the Lebesgue set of $f$, then

$$\lim_{\substack{m(B)\to 0 \\ x\in B}} \frac{1}{m(B)}\int_B f(y)\,dy = f(x).$$


Corollary 1.6 If $f$ is locally integrable on $\mathbb{R}^d$, then almost every point
belongs to the Lebesgue set of $f$.

Proof. An application of Theorem 1.4 to the function $|f(y) - r|$ shows
that for each rational $r$, there exists a set $E_r$ of measure zero, such that

$$\lim_{\substack{m(B)\to 0 \\ x\in B}} \frac{1}{m(B)}\int_B |f(y) - r|\,dy = |f(x) - r| \quad \text{whenever } x \notin E_r.$$

If $E = \bigcup_{r\in\mathbb{Q}} E_r$, then $m(E) = 0$. Now suppose that $x \notin E$ and $f(x)$ is
finite. Given $\epsilon > 0$, there exists a rational $r$ such that $|f(x) - r| < \epsilon$.
Since

$$\frac{1}{m(B)}\int_B |f(y) - f(x)|\,dy \le \frac{1}{m(B)}\int_B |f(y) - r|\,dy + |f(x) - r|,$$

we must have

$$\limsup_{\substack{m(B)\to 0 \\ x\in B}} \frac{1}{m(B)}\int_B |f(y) - f(x)|\,dy \le 2\epsilon,$$

and thus $x$ is in the Lebesgue set of $f$. The corollary is therefore proved.


Remark. Recall from the definition in Section 2 of Chapter 2 that
elements of $L^1(\mathbb{R}^d)$ are actually equivalence classes, with two functions
being equivalent if they differ on a set of measure zero. It is interesting
to observe that the set of points where the averages (4) converge to a
limit is independent of the representation of $f$ chosen, because

$$\int_B f(y)\,dy = \int_B g(y)\,dy$$

whenever $f$ and $g$ are equivalent. Nevertheless, the Lebesgue set of $f$
depends on the particular representative of $f$ that we consider.

We shall see that the Lebesgue set of a function enjoys a universal 
property in that at its points the function can be recovered by a wide 
variety of averages. We will prove this both for averages over sets that 
generalize balls, and in the setting of approximations to the identity. 
Note that the theory of differentiation developed so far uses averages 
over balls, but as we mentioned earlier, one could ask whether similar 
conclusions hold for other families of sets, such as cubes or rectangles. 
The answer depends in a fundamental way on the geometric properties 
of the family in question. For example, we now show that in the case of 
cubes (and more generally families of sets with bounded “eccentricity”) 
the above results carry over. However, in the case of the family of all
rectangles the existence of the limit almost everywhere and the weak
type inequality fail (see Problem 8).

A collection of sets $\{U_\alpha\}$ is said to shrink regularly to $x$ (or has
bounded eccentricity at $x$) if there is a constant $c > 0$ such that for
each $U_\alpha$ there is a ball $B$ with

$$x \in B, \quad U_\alpha \subset B, \quad \text{and} \quad m(U_\alpha) \ge c\, m(B).$$

Thus $U_\alpha$ is contained in $B$, but its measure is comparable to the measure
of $B$. For example, the set of all open cubes containing $x$ shrinks regularly
to $x$. However, in $\mathbb{R}^d$ with $d \ge 2$ the collection of all open rectangles
containing $x$ does not shrink regularly to $x$. This can be seen if we
consider very thin rectangles.
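The contrast between cubes and thin rectangles can be made concrete in $\mathbb{R}^2$. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and it uses the elementary fact that the smallest disc containing a rectangle is the one whose diameter is the rectangle's diagonal, so the ratio it returns is the best possible constant $c$ for that rectangle.

```python
import math


def ratio_to_enclosing_disc(width: float, height: float) -> float:
    """Area of a width-by-height rectangle divided by the area of the
    smallest disc containing it (diameter = diagonal of the rectangle)."""
    diagonal = math.hypot(width, height)
    return (width * height) / (math.pi * (diagonal / 2.0) ** 2)


if __name__ == "__main__":
    # Squares: the ratio does not depend on the side length.
    for side in (1.0, 1e-2, 1e-4):
        print("square", side, ratio_to_enclosing_disc(side, side))
    # Rectangles of aspect ratio t: the ratio is 4t / (pi (1 + t^2)),
    # which tends to 0 as t tends to 0.
    for t in (1e-1, 1e-2, 1e-3):
        print("thin rectangle", t, ratio_to_enclosing_disc(1.0, t))
```

For squares the ratio stays at $2/\pi$ no matter how small the square is, so a single $c$ works; for rectangles of aspect ratio $t$ the ratio tends to 0, so no single $c$ can serve the whole family.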

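Corollary 1.7 Suppose $f$ is locally integrable on $\mathbb{R}^d$. If $\{U_\alpha\}$ shrinks
regularly to $x$, then

$$\lim_{\substack{m(U_\alpha)\to 0 \\ x\in U_\alpha}} \frac{1}{m(U_\alpha)}\int_{U_\alpha} f(y)\,dy = f(x)$$

for every point $x$ in the Lebesgue set of $f$.

The proof is immediate once we observe that if $x \in B$ with $U_\alpha \subset B$
and $m(U_\alpha) \ge c\, m(B)$, then

$$\frac{1}{m(U_\alpha)}\int_{U_\alpha} |f(y) - f(x)|\,dy \le \frac{1}{c\, m(B)}\int_B |f(y) - f(x)|\,dy.$$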


2 Good kernels and approximations to the identity 

We shall now turn to averages of functions given as convolutions,^1 which
can be written as

$$(f * K_\delta)(x) = \int_{\mathbb{R}^d} f(x - y) K_\delta(y)\,dy.$$

Here $f$ is a general integrable function, which we keep fixed, while the $K_\delta$
vary over a specific family of functions, referred to as kernels. Expressions
of this kind arise in many questions (for instance, in the Fourier inversion
theorem of the previous chapter), and were already discussed in Book I.
In our initial consideration we called these functions “good kernels” if
they are integrable and satisfy the following conditions for $\delta > 0$:

(i) $\int_{\mathbb{R}^d} K_\delta(x)\,dx = 1$.

(ii) $\int_{\mathbb{R}^d} |K_\delta(x)|\,dx \le A$.

(iii) For every $\eta > 0$,

$$\int_{|x| \ge \eta} |K_\delta(x)|\,dx \to 0 \quad \text{as } \delta \to 0.$$

Here $A$ is a constant independent of $\delta$.

^1 Some basic properties of convolutions are described in Exercise 21 of the previous
chapter.

The main use of these kernels was that whenever $f$ is bounded, then
$(f * K_\delta)(x) \to f(x)$ as $\delta \to 0$, at every point of continuity of $f$. To obtain
a similar conclusion, one also valid at all points of the Lebesgue set
of $f$, we need to strengthen somewhat our assumptions on the kernels
$K_\delta$. To reflect this situation we adopt a different terminology and refer
to the resulting narrower class of kernels as approximations to the
identity. The assumptions are again that the $K_\delta$ are integrable and
satisfy condition (i) but, instead of (ii) and (iii), we assume:

(ii') $|K_\delta(x)| \le A\delta^{-d}$ for all $\delta > 0$.

(iii') $|K_\delta(x)| \le A\delta / |x|^{d+1}$ for all $\delta > 0$ and $x \in \mathbb{R}^d$.^2

We observe that these requirements are stronger and imply the conditions
in the definition of good kernels. Indeed, we first prove (ii). For that, we
use the second illustration of Corollary 1.10 in Chapter 2, which gives

$$\int_{|x| \ge \epsilon} \frac{dx}{|x|^{d+1}} \le \frac{C}{\epsilon} \quad \text{for some } C > 0 \text{ and all } \epsilon > 0. \tag{5}$$

Then, using the estimates (ii') and (iii') when $|x| < \delta$ and $|x| \ge \delta$,
respectively, yields

$$\int_{\mathbb{R}^d} |K_\delta(x)|\,dx = \int_{|x| < \delta} |K_\delta(x)|\,dx + \int_{|x| \ge \delta} |K_\delta(x)|\,dx$$
$$\le A \int_{|x| < \delta} \frac{dx}{\delta^d} + A\delta \int_{|x| \ge \delta} \frac{dx}{|x|^{d+1}}$$
$$\le A' + A'' < \infty.$$

Finally, the last condition of a good kernel is also verified, since another
application of (5) gives

$$\int_{|x| \ge \eta} |K_\delta(x)|\,dx \le A\delta \int_{|x| \ge \eta} \frac{dx}{|x|^{d+1}} \le \frac{A'\delta}{\eta},$$

and this last expression tends to 0 as $\delta \to 0$.

^2 Sometimes the condition (iii') is replaced by the requirement $|K_\delta(x)| \le A\delta^{\epsilon}/|x|^{d+\epsilon}$
for some fixed $\epsilon > 0$. However, the special case $\epsilon = 1$ suffices in most circumstances.

The term “approximation to the identity” originates in the fact that
the mapping $f \mapsto f * K_\delta$ converges to the identity mapping $f \mapsto f$, as
$\delta \to 0$, in various senses, as we shall see below. It is also connected with
the following heuristics. Figure 2 pictures a typical approximation to the
identity: for each $\delta > 0$, the kernel is supported on the set $|x| < \delta$ and
has height $1/2\delta$. As $\delta$ tends to 0, this family of kernels converges to the
so-called unit mass at the origin or Dirac delta “function.” The latter
is heuristically defined by

$$\mathcal{D}(x) = \begin{cases} \infty & \text{if } x = 0, \\ 0 & \text{if } x \ne 0, \end{cases} \quad \text{and} \quad \int \mathcal{D}(x)\,dx = 1.$$

[Figure 2. An approximation to the identity]

Since each $K_\delta$ integrates to 1, we may say loosely that

$$K_\delta \to \mathcal{D} \quad \text{as } \delta \to 0.$$

If we think of the convolution $f * \mathcal{D}$ as $\int f(x - y)\mathcal{D}(y)\,dy$, the product
$f(x - y)\mathcal{D}(y)$ is 0 except when $y = 0$, and the mass of $\mathcal{D}$ is concentrated
at $y = 0$, so we may intuitively expect that

$$(f * \mathcal{D})(x) = f(x).$$

Thus $f * \mathcal{D} = f$, and $\mathcal{D}$ plays the role of the identity for convolutions.
We should mention that this discussion can be formalized and $\mathcal{D}$ given
a precise definition either in terms of Lebesgue-Stieltjes measures, which
we take up in Chapter 6, or in terms of “generalized functions” (that is,
distributions), which we defer to Book IV.

We now turn to a series of examples of approximations to the identity. 

Example 1. Suppose $\varphi$ is a non-negative bounded function in $\mathbb{R}^d$ that
is supported on the unit ball $|x| \le 1$, and such that

$$\int_{\mathbb{R}^d} \varphi = 1.$$

Then, if we set $K_\delta(x) = \delta^{-d}\varphi(\delta^{-1}x)$, the family $\{K_\delta\}_{\delta>0}$ is an
approximation to the identity. The simple verification is left to the reader.
Important special cases are in the next two examples.

Example 2. The Poisson kernel for the upper half-plane is given by

$$\mathcal{P}_y(x) = \frac{1}{\pi}\frac{y}{x^2 + y^2}, \quad x \in \mathbb{R},$$

where the parameter is now $\delta = y > 0$.

Example 3. The heat kernel in $\mathbb{R}^d$ is defined by

$$\mathcal{H}_t(x) = \frac{1}{(4\pi t)^{d/2}} e^{-|x|^2/4t}.$$

Here $t > 0$ and we have $\delta = t^{1/2}$. Alternatively, we could set $\delta = 4\pi t$ to
make the notation consistent with the specific usage in Chapter 2.

Example 4. The Poisson kernel for the disc is

$$\frac{1}{2\pi}P_r(x) = \begin{cases} \dfrac{1}{2\pi}\dfrac{1 - r^2}{1 - 2r\cos x + r^2} & \text{if } |x| \le \pi, \\ 0 & \text{if } |x| > \pi. \end{cases}$$

Here we have $0 < r < 1$ and $\delta = 1 - r$.

Example 5. The Fejér kernel is defined by

$$\frac{1}{2\pi}F_N(x) = \begin{cases} \dfrac{1}{2\pi N}\dfrac{\sin^2(Nx/2)}{\sin^2(x/2)} & \text{if } |x| \le \pi, \\ 0 & \text{if } |x| > \pi, \end{cases}$$

where $\delta = 1/N$.
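As a sanity check on the definitions, the conditions (i), (ii') and (iii') can be tested numerically for the Poisson kernel of Example 2 (so $d = 1$ and $\delta = y$). The sketch below is only an illustration: the constant `A = 1/pi`, the function names, the midpoint rule, and the truncation of the integral to a finite interval are assumptions of the example, not part of the text.

```python
import math

A = 1.0 / math.pi  # a constant that works in (ii') and (iii') for this kernel


def poisson(x: float, y: float) -> float:
    """Poisson kernel for the upper half-plane, P_y(x) = y / (pi (x^2 + y^2))."""
    return y / (math.pi * (x * x + y * y))


def satisfies_bounds(x: float, y: float) -> bool:
    """Check (ii') P_y(x) <= A / y and (iii') P_y(x) <= A y / x^2 at one point."""
    slack = 1.0 + 1e-12  # tolerance for floating-point rounding
    size_bound = poisson(x, y) <= A / y * slack
    decay_bound = x == 0.0 or poisson(x, y) <= A * y / (x * x) * slack
    return size_bound and decay_bound


def mass_on_interval(y: float, radius: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of P_y over [-radius, radius]."""
    h = 2.0 * radius / steps
    return h * sum(poisson(-radius + (i + 0.5) * h, y) for i in range(steps))


if __name__ == "__main__":
    for y in (1.0, 0.1):
        # The exact mass on [-R, R] is (2/pi) * atan(R / y), which tends to 1
        # as R grows, in agreement with condition (i).
        print(y, mass_on_interval(y, 50.0), 2.0 / math.pi * math.atan(50.0 / y))
```

The mass missing from $[-R, R]$ is of size about $2y/(\pi R)$, which is the decay that condition (iii) of a good kernel asks for.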

We note that Examples 2 through 5 have already appeared in Book I.

We now turn to a general result about approximations to the identity
that highlights the role of the Lebesgue set.

Theorem 2.1 If $\{K_\delta\}_{\delta>0}$ is an approximation to the identity and $f$ is
integrable on $\mathbb{R}^d$, then

$$(f * K_\delta)(x) \to f(x) \quad \text{as } \delta \to 0$$

for every $x$ in the Lebesgue set of $f$. In particular, the limit holds for
a.e. $x$.

Since the integral of each kernel $K_\delta$ is equal to 1, we may write

$$(f * K_\delta)(x) - f(x) = \int [f(x - y) - f(x)]\, K_\delta(y)\,dy.$$

Consequently,

$$|(f * K_\delta)(x) - f(x)| \le \int |f(x - y) - f(x)|\, |K_\delta(y)|\,dy,$$

and it now suffices to prove that the right-hand side tends to 0 as $\delta$ goes
to 0. The argument we give depends on a simple result that we isolate
in the next lemma.


Lemma 2.2 Suppose that $f$ is integrable on $\mathbb{R}^d$, and that $x$ is a point of
the Lebesgue set of $f$. Let

$$\mathcal{A}(r) = \frac{1}{r^d}\int_{|y| \le r} |f(x - y) - f(x)|\,dy, \quad \text{whenever } r > 0.$$

Then $\mathcal{A}(r)$ is a continuous function of $r > 0$, and

$$\mathcal{A}(r) \to 0 \quad \text{as } r \to 0.$$

Moreover, $\mathcal{A}(r)$ is bounded, that is, $\mathcal{A}(r) \le M$ for some $M > 0$ and all
$r > 0$.

Proof. The continuity of $\mathcal{A}(r)$ follows by invoking the absolute
continuity in Proposition 1.12 of Chapter 2.

The fact that $\mathcal{A}(r)$ tends to 0 as $r$ tends to 0 follows since $x$ belongs
to the Lebesgue set of $f$, and the measure of a ball of radius $r$ is $v_d r^d$.
This and the continuity of $\mathcal{A}(r)$ show that $\mathcal{A}(r)$ is bounded
when $0 < r \le 1$. To prove that $\mathcal{A}(r)$ is bounded for $r > 1$, note that

$$\mathcal{A}(r) \le \frac{1}{r^d}\int_{|y| \le r} |f(x - y)|\,dy + \frac{1}{r^d}\int_{|y| \le r} |f(x)|\,dy \le r^{-d}\|f\|_{L^1(\mathbb{R}^d)} + v_d |f(x)|,$$

and this concludes the proof of the lemma.

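We now return to the proof of the theorem. The key consists in writing
the integral over $\mathbb{R}^d$ as a sum of integrals over annuli as follows:

$$\int |f(x - y) - f(x)|\, |K_\delta(y)|\,dy = \int_{|y| \le \delta} + \sum_{k=0}^{\infty} \int_{2^k\delta < |y| \le 2^{k+1}\delta}.$$

By using the property (ii') of the approximation to the identity, the first
term is estimated by

$$\int_{|y| \le \delta} |f(x - y) - f(x)|\, |K_\delta(y)|\,dy \le \frac{c}{\delta^d}\int_{|y| \le \delta} |f(x - y) - f(x)|\,dy \le c\,\mathcal{A}(\delta).$$

Each term in the sum is estimated similarly, but this time by using
property (iii') of approximations to the identity:

$$\int_{2^k\delta < |y| \le 2^{k+1}\delta} |f(x - y) - f(x)|\, |K_\delta(y)|\,dy \le \frac{c\,\delta}{(2^k\delta)^{d+1}}\int_{|y| \le 2^{k+1}\delta} |f(x - y) - f(x)|\,dy$$
$$\le \frac{c'}{2^k (2^{k+1}\delta)^d}\int_{|y| \le 2^{k+1}\delta} |f(x - y) - f(x)|\,dy \le c'\, 2^{-k} \mathcal{A}(2^{k+1}\delta).$$

Putting these estimates together, we find that

$$|(f * K_\delta)(x) - f(x)| \le c\,\mathcal{A}(\delta) + c' \sum_{k=0}^{\infty} 2^{-k} \mathcal{A}(2^{k+1}\delta).$$

Given $\epsilon > 0$, we first choose $N$ so large that $\sum_{k \ge N} 2^{-k} < \epsilon$. Then, by
making $\delta$ sufficiently small, we have by the lemma

$$\mathcal{A}(2^k\delta) < \epsilon/N, \quad \text{whenever } k = 0, 1, \ldots, N.$$

Hence, recalling that $\mathcal{A}(r)$ is bounded, we find

$$|(f * K_\delta)(x) - f(x)| \le C\epsilon$$

for all sufficiently small $\delta$, and the theorem is proved.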
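The role of the Lebesgue set in Theorem 2.1 shows up even in a one-dimensional toy case. The sketch below is an illustration under assumptions of its own: $f$ is taken to be the characteristic function of $[0, 1]$, the kernel is the Poisson kernel $\mathcal{P}_y$ of Example 2, and the convolution is evaluated in closed form through the arctangent; the function name is hypothetical.

```python
import math


def poisson_smoothed_step(x: float, y: float) -> float:
    """(f * P_y)(x) for f = characteristic function of [0, 1].

    Integrating P_y(x - t) over t in [0, 1] gives
    (atan(x / y) - atan((x - 1) / y)) / pi.
    """
    return (math.atan(x / y) - math.atan((x - 1.0) / y)) / math.pi


if __name__ == "__main__":
    for y in (1e-1, 1e-3, 1e-6):
        # x = 0.5 is a point of continuity (so it lies in the Lebesgue set)
        # and the values approach f(0.5) = 1.  At the jump x = 0, which is
        # not in the Lebesgue set, the values approach 1/2, not f(0) = 1.
        print(y, poisson_smoothed_step(0.5, y), poisson_smoothed_step(0.0, y))
```

The two jump points $x = 0$ and $x = 1$ form a set of measure zero, so the behavior there does not conflict with convergence almost everywhere.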

In addition to this pointwise result, convolutions with approximations
to the identity also provide convergence in the $L^1$-norm.

Theorem 2.3 Suppose that $f$ is integrable on $\mathbb{R}^d$ and that $\{K_\delta\}_{\delta>0}$ is
an approximation to the identity. Then, for each $\delta > 0$, the convolution

$$(f * K_\delta)(x) = \int_{\mathbb{R}^d} f(x - y) K_\delta(y)\,dy$$

is integrable, and

$$\|(f * K_\delta) - f\|_{L^1(\mathbb{R}^d)} \to 0, \quad \text{as } \delta \to 0.$$

The proof is merely a repetition in a more general context of the argument
in the special case where $K_\delta(x) = \delta^{-d/2} e^{-\pi|x|^2/\delta}$ given in Section 4*,
Chapter 2, and so will not be repeated.

3 Differentiability of functions 

We now take up the second question raised at the beginning of this
chapter, that of finding a broad condition on functions $F$ that guarantees
the identity

$$F(b) - F(a) = \int_a^b F'(x)\,dx. \tag{6}$$

There are two phenomena that make a general formulation of this identity
problematic. First, because of the existence of non-differentiable
functions,^3 the right-hand side of (6) might not be meaningful if we merely
assumed $F$ was continuous. Second, even if $F'(x)$ existed for every $x$,
the function $F'$ would not necessarily be (Lebesgue) integrable. (See
Exercise 12.)

^3 In particular, there are continuous nowhere differentiable functions. See Chapter 4 in
Book I, or also Chapter 7 below.

How do we deal with these difficulties? One way is by limiting ourselves
to those functions $F$ that arise as indefinite integrals (of integrable
functions). This raises the issue of how to characterize such functions, and
we approach that problem via the study of a wider class, the functions
of bounded variation. These functions are closely related to the question
of rectifiability of curves, and we start by considering this connection.

3.1 Functions of bounded variation 

Let $\gamma$ be a parametrized curve in the plane given by $z(t) = (x(t), y(t))$,
where $a \le t \le b$. Here $x(t)$ and $y(t)$ are continuous real-valued functions
on $[a, b]$. The curve $\gamma$ is rectifiable if there exists $M < \infty$ such that, for
any partition $a = t_0 < t_1 < \cdots < t_N = b$ of $[a, b]$,

$$\sum_{j=1}^{N} |z(t_j) - z(t_{j-1})| \le M. \tag{7}$$

By definition, the length $L(\gamma)$ of the curve is the supremum over all
partitions of the sum on the left-hand side, that is,

$$L(\gamma) = \sup_{a = t_0 < t_1 < \cdots < t_N = b} \sum_{j=1}^{N} |z(t_j) - z(t_{j-1})|.$$

Alternatively, $L(\gamma)$ is the infimum of all $M$ that satisfy (7).
Geometrically, the quantity $L(\gamma)$ is obtained by approximating the curve by
polygonal lines and taking the limit of the length of these polygonal
lines as the interval $[a, b]$ is partitioned more finely (see the illustration
in Figure 3).

Naturally, we may now ask the following questions: What analytic
condition on $x(t)$ and $y(t)$ guarantees rectifiability of the curve $\gamma$? In
particular, must the derivatives of $x(t)$ and $y(t)$ exist? If so, does one
have the desired formula

$$L(\gamma) = \int_a^b \left( x'(t)^2 + y'(t)^2 \right)^{1/2} dt\,?$$

The answer to the first question leads directly to the class of functions
of bounded variation, a class that plays a key role in the theory of
differentiation.

Suppose $F(t)$ is a complex-valued function defined on $[a, b]$, and
$a = t_0 < t_1 < \cdots < t_N = b$ is a partition of this interval. The variation of $F$
on this partition is defined by

$$\sum_{j=1}^{N} |F(t_j) - F(t_{j-1})|.$$

[Figure 3. Approximation of a rectifiable curve by polygonal lines]

The function $F$ is said to be of bounded variation if the variations of
$F$ over all partitions are bounded, that is, there exists $M < \infty$ so that

$$\sum_{j=1}^{N} |F(t_j) - F(t_{j-1})| \le M$$

for all partitions $a = t_0 < t_1 < \cdots < t_N = b$. In this definition we do not
assume that $F$ is continuous; however, when applying it to the case of
curves, we will suppose that $F(t) = z(t) = x(t) + iy(t)$ is continuous.

We observe that if a partition $\tilde{\mathcal{P}}$ given by $a = \tilde{t}_0 < \tilde{t}_1 < \cdots < \tilde{t}_M = b$ is
a refinement^4 of a partition $\mathcal{P}$ given by $a = t_0 < t_1 < \cdots < t_N = b$, then
the variation of $F$ on $\tilde{\mathcal{P}}$ is greater than or equal to the variation of $F$ on
$\mathcal{P}$.

Theorem 3.1 A curve parametrized by $(x(t), y(t))$, $a \le t \le b$, is
rectifiable if and only if both $x(t)$ and $y(t)$ are of bounded variation.

The proof is immediate once we observe that if $F(t) = x(t) + iy(t)$, then

$$F(t_j) - F(t_{j-1}) = (x(t_j) - x(t_{j-1})) + i\,(y(t_j) - y(t_{j-1})),$$

and if $a$ and $b$ are real, then $|a + ib| \le |a| + |b| \le 2|a + ib|$.

^4 We say that a partition $\tilde{\mathcal{P}}$ of $[a, b]$ is a refinement of a partition $\mathcal{P}$ of $[a, b]$ if every
point in $\mathcal{P}$ also belongs to $\tilde{\mathcal{P}}$.

Intuitively, a function of bounded variation cannot oscillate too often 
with amplitudes that are too large. Some examples should help clarify 
this assertion. 

We first fix some terminology. A real-valued function $F$ defined on
$[a, b]$ is increasing if $F(t_1) \le F(t_2)$ whenever $a \le t_1 \le t_2 \le b$. If the
inequality is strict, we say that $F$ is strictly increasing.

Example 1. If $F$ is real-valued, monotonic, and bounded, then $F$ is of
bounded variation. Indeed, if for example $F$ is increasing and bounded
by $M$, we see that

$$\sum_{j=1}^{N} |F(t_j) - F(t_{j-1})| = \sum_{j=1}^{N} F(t_j) - F(t_{j-1}) = F(b) - F(a) \le 2M.$$

Example 2. If $F$ is differentiable at every point, and $F'$ is bounded,
then $F$ is of bounded variation. Indeed, if $|F'| \le M$, the mean value
theorem implies

$$|F(x) - F(y)| \le M|x - y|, \quad \text{for all } x, y \in [a, b],$$

hence $\sum_{j=1}^{N} |F(t_j) - F(t_{j-1})| \le M(b - a)$. (See also Exercise 23.)

Example 3. Let

$$F(x) = \begin{cases} x^a \sin(x^{-b}) & \text{for } 0 < x \le 1, \\ 0 & \text{if } x = 0. \end{cases}$$

Then $F$ is of bounded variation on $[0, 1]$ if and only if $a > b$ (Exercise 11).
Figure 4 illustrates the three cases $a > b$, $a = b$, and $a < b$.
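The dichotomy in Example 3 can be probed numerically by computing the variation of $F(x) = x^a \sin(x^{-b})$ on one specific family of partitions. The sketch below is an illustration, not a proof: the partition points $x_k = ((k + 1/2)\pi)^{-1/b}$, where $\sin(x^{-b}) = \pm 1$, and the function names are choices made for the example. Partition sums only bound the total variation from below, so the experiment can exhibit unboundedness but can only suggest boundedness.

```python
import math


def F(x: float, a: float, b: float) -> float:
    """F(x) = x^a sin(x^-b) for 0 < x <= 1, and F(0) = 0."""
    return 0.0 if x == 0.0 else x ** a * math.sin(x ** -b)


def extremal_variation(a: float, b: float, terms: int) -> float:
    """Variation of F over the partition of [0, 1] consisting of 1, 0 and the
    points x_k = ((k + 1/2) pi)^(-1/b), k = 0, ..., terms - 1."""
    points = [1.0]
    points += [((k + 0.5) * math.pi) ** (-1.0 / b) for k in range(terms)]
    points.append(0.0)  # the points are listed in decreasing order
    return sum(
        abs(F(points[i], a, b) - F(points[i + 1], a, b))
        for i in range(len(points) - 1)
    )


if __name__ == "__main__":
    for terms in (10**2, 10**3, 10**4, 10**5):
        # a = 2, b = 1: the sums stabilize.  a = b = 1: they keep growing,
        # roughly like (2 / pi) * log(terms).
        print(terms, extremal_variation(2, 1, terms), extremal_variation(1, 1, terms))
```

On these partitions the variation behaves like $2\sum_k ((k + 1/2)\pi)^{-a/b}$, a series that converges exactly when $a > b$.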

The next result shows that in some sense the first example above
exhausts all functions of bounded variation. For its proof, we need the
following definitions. The total variation of $F$ on $[a, x]$ (where $a \le x \le b$)
is defined by

$$T_F(a, x) = \sup \sum_{j=1}^{N} |F(t_j) - F(t_{j-1})|,$$

where the sup is over all partitions of $[a, x]$. The preceding definition
makes sense if $F$ is complex-valued. The succeeding ones require that
$F$ is real-valued. In the spirit of the first definition, we say that the
positive variation of $F$ on $[a, x]$ is

$$P_F(a, x) = \sup \sum_{(+)} F(t_j) - F(t_{j-1}),$$

where the sum is over all $j$ such that $F(t_j) \ge F(t_{j-1})$, and the supremum
is over all partitions of $[a, x]$. Finally, the negative variation of $F$ on
$[a, x]$ is defined by

$$N_F(a, x) = \sup \sum_{(-)} -[F(t_j) - F(t_{j-1})],$$

where the sum is over all $j$ such that $F(t_j) \le F(t_{j-1})$, and the supremum
is over all partitions of $[a, x]$.

[Figure 4. Graphs of $x^a \sin(x^{-b})$ for different values of $a$ and $b$]

Lemma 3.2 Suppose $F$ is real-valued and of bounded variation on $[a, b]$.
Then for all $a \le x \le b$ one has

$$F(x) - F(a) = P_F(a, x) - N_F(a, x),$$

and

$$T_F(a, x) = P_F(a, x) + N_F(a, x).$$

Proof. Given $\epsilon > 0$ there exists a partition $a = t_0 < \cdots < t_N = x$ of
$[a, x]$, such that

$$\left| P_F - \sum_{(+)} F(t_j) - F(t_{j-1}) \right| < \epsilon \quad \text{and} \quad \left| N_F - \sum_{(-)} -[F(t_j) - F(t_{j-1})] \right| < \epsilon.$$

(To see this, it suffices to use the definition to obtain similar estimates
for $P_F$ and $N_F$ with possibly different partitions, and then to consider a
common refinement of these two partitions.) Since we also note that

$$F(x) - F(a) = \sum_{(+)} F(t_j) - F(t_{j-1}) - \sum_{(-)} -[F(t_j) - F(t_{j-1})],$$

we find that $|F(x) - F(a) - [P_F - N_F]| < 2\epsilon$, which proves the first
identity.

For the second identity, we also note that for any partition
$a = t_0 < \cdots < t_N = x$ of $[a, x]$ we have

$$\sum_{j=1}^{N} |F(t_j) - F(t_{j-1})| = \sum_{(+)} F(t_j) - F(t_{j-1}) + \sum_{(-)} -[F(t_j) - F(t_{j-1})],$$

hence $T_F \le P_F + N_F$. Also, the above implies

$$\sum_{(+)} F(t_j) - F(t_{j-1}) + \sum_{(-)} -[F(t_j) - F(t_{j-1})] \le T_F.$$

Once again, one can argue using common refinements of partitions in the
definitions of $P_F$ and $N_F$ to deduce the inequality $P_F + N_F \le T_F$, and
the lemma is proved.


Theorem 3.3 A real-valued function $F$ on $[a, b]$ is of bounded variation
if and only if $F$ is the difference of two increasing bounded functions.

Proof. Clearly, if $F = F_1 - F_2$, where each $F_j$ is bounded and
increasing, then $F$ is of bounded variation.

Conversely, suppose $F$ is of bounded variation. Then, we let $F_1(x) =
P_F(a, x) + F(a)$ and $F_2(x) = N_F(a, x)$. Clearly, both $F_1$ and $F_2$ are
increasing, of bounded variation, and by the lemma $F(x) = F_1(x) - F_2(x)$.

Observe that as a consequence, a complex-valued function of bounded
variation is a (complex) linear combination of four increasing functions.
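The decomposition $F = F_1 - F_2$ is easy to carry out for a function known only through its values $F(t_0), \ldots, F(t_N)$ on a fixed partition. The sketch below is a simplified illustration: it accumulates the positive and negative increments along that one partition rather than taking suprema over all partitions, so it agrees with $P_F$ and $N_F$ only when $F$ is monotone between consecutive sample points (for instance, for the piecewise linear interpolant). The function name is hypothetical.

```python
from itertools import accumulate


def jordan_decomposition(values):
    """Cumulative positive and negative variation along the given samples.

    Returns two lists (positive, negative) of the same length as `values`,
    with positive[j] - negative[j] == values[j] - values[0].
    """
    increments = [later - earlier for earlier, later in zip(values, values[1:])]
    positive = [0] + list(accumulate(max(d, 0) for d in increments))
    negative = [0] + list(accumulate(max(-d, 0) for d in increments))
    return positive, negative


if __name__ == "__main__":
    samples = [0, 3, 1, 4, 4, 2]
    positive, negative = jordan_decomposition(samples)
    increasing_part = [samples[0] + p for p in positive]   # plays the role of F_1
    decreasing_part = negative                             # plays the role of F_2
    print(increasing_part, decreasing_part)
```

Both output sequences are non-decreasing, their difference reproduces the samples, and their sum (after removing $F(a)$) is the variation on the partition, mirroring the two identities of Lemma 3.2.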

Returning to the curve $\gamma$ parametrized by a continuous function $z(t) =
x(t) + iy(t)$, we want to make some comments about its associated length
function. Assuming that the curve is rectifiable, we define $L(A, B)$ as the
length of the segment of $\gamma$ that arises as the image of those $t$ for which
$A \le t \le B$, with $a \le A \le B \le b$. Note that $L(A, B) = T_F(A, B)$, where
$F(t) = z(t)$. We see that

$$L(A, C) + L(C, B) = L(A, B) \quad \text{if } A \le C \le B. \tag{8}$$

We also observe that $L(A, B)$ is a continuous function of $B$ (and of
$A$). Since it is an increasing function, to prove its continuity in $B$ from
the left, it suffices to see that for each $B$ and $\epsilon > 0$, we can find $B_1 < B$
such that $L(A, B_1) \ge L(A, B) - \epsilon$. We do this by first finding a partition
$A = t_0 < t_1 < \cdots < t_N = B$ such that the length of the corresponding
polygonal line is $\ge L(A, B) - \epsilon/2$. By continuity of the function $z(t)$,
we can find a $B_1$, with $t_{N-1} < B_1 < B$, such that $|z(B) - z(B_1)| < \epsilon/2$.
Now for the refined partition $t_0 < t_1 < \cdots < t_{N-1} < B_1 < B$, the length
of the polygonal line is still $\ge L(A, B) - \epsilon/2$. Therefore, the length
for the partition $t_0 < t_1 < \cdots < t_{N-1} < B_1$ is $\ge L(A, B) - \epsilon$, and thus
$L(A, B_1) \ge L(A, B) - \epsilon$.

To prove continuity from the right at $B$, let $\epsilon > 0$, pick any $C > B$,
and choose a partition $B = t_0 < t_1 < \cdots < t_N = C$ such that
$L(B, C) - \epsilon/2 < \sum_{j=0}^{N-1} |z(t_{j+1}) - z(t_j)|$. By considering a refinement of this
partition if necessary, we may assume since $z$ is continuous that
$|z(t_1) - z(t_0)| < \epsilon/2$. If we denote $B_1 = t_1$, then we get

$$L(B, C) - \epsilon/2 < \epsilon/2 + L(B_1, C).$$

Since $L(B, B_1) + L(B_1, C) = L(B, C)$ we have $L(B, B_1) < \epsilon$, and
therefore $L(A, B_1) - L(A, B) < \epsilon$.

Note that what we have observed can be re-stated as follows: if a
function of bounded variation is continuous, then so is its total variation.

The next result lies at the heart of the theory of differentiation.

Theorem 3.4 If $F$ is of bounded variation on $[a, b]$, then $F$ is
differentiable almost everywhere.

In other words, the quotient

$$\lim_{h \to 0} \frac{F(x + h) - F(x)}{h}$$

exists for almost every $x \in [a, b]$. By the previous result, it suffices to
consider the case when $F$ is increasing. In fact, we shall first also assume
that $F$ is continuous. This makes the argument simpler. As for the
general case, we leave that till later. (See Section 3.3.) It will then
be instructive to examine the nature of the possible discontinuities of a
function of bounded variation, and reduce matters to the case of “jump
functions.”

We begin with a nice technical lemma of F. Riesz, which has the effect
of a covering argument.

Lemma 3.5 Suppose $G$ is real-valued and continuous on $\mathbb{R}$. Let $E$ be
the set of points $x$ such that

$$G(x + h) > G(x) \quad \text{for some } h = h_x > 0.$$

If $E$ is non-empty, then it must be open, and hence can be written as a
countable disjoint union of open intervals $E = \bigcup (a_k, b_k)$. If $(a_k, b_k)$ is a
finite interval in this union, then

$$G(b_k) - G(a_k) = 0.$$

Proof. Since $G$ is continuous, it is clear that $E$ is open whenever it is
non-empty and can therefore be written as a disjoint union of countably
many open intervals (Theorem 1.3 in Chapter 1). If $(a_k, b_k)$ denotes a
finite interval in this decomposition, then $a_k \notin E$; therefore we cannot
have $G(b_k) > G(a_k)$. We now suppose that $G(b_k) < G(a_k)$. By
continuity, there exists $a_k < c < b_k$ so that

$$G(c) = \frac{G(a_k) + G(b_k)}{2},$$

and in fact we may choose $c$ farthest to the right in the interval $(a_k, b_k)$.
Since $c \in E$, there exists $d > c$ such that $G(d) > G(c)$. Since $b_k \notin E$, we
must have $G(x) \le G(b_k)$ for all $x \ge b_k$; therefore $d < b_k$. Since $G(d) >
G(c)$, there exists (by continuity) $c' > d$ with $c' < b_k$ and $G(c') = G(c)$,
which contradicts the fact that $c$ was chosen farthest to the right in
$(a_k, b_k)$. This shows that we must have $G(a_k) = G(b_k)$, and the lemma
is proved.

Note. This result sometimes carries the name “rising sun lemma” for
the following reason. If one thinks of the sun rising from the east (at
the right) with the rays of light parallel to the $x$-axis, then the points
$(x, G(x))$ on the graph of $G$, with $x \in E$, are precisely the points which
are in the shade; these points appear in bold in Figure 5.

[Figure 5. Rising sun lemma]

A slight modification of the proof of Lemma 3.5 gives:

Corollary 3.6 Suppose $G$ is real-valued and continuous on a closed
interval $[a, b]$. If $E$ denotes the set of points $x$ in $(a, b)$ so that $G(x + h) >
G(x)$ for some $h > 0$, then $E$ is either empty or open. In the latter
case, it is a disjoint union of countably many intervals $(a_k, b_k)$, and
$G(a_k) = G(b_k)$, except possibly when $a = a_k$, in which case we only have
$G(a_k) \le G(b_k)$.
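A discretized version of Corollary 3.6 can be computed directly. The sketch below is an approximation under assumptions of its own: $G$ is sampled on a uniform grid, a grid point counts as shaded when some later grid point is higher by more than a small tolerance `tol` (introduced only to absorb floating-point rounding), and each shaded run is reported by the indices of the unshaded grid points that bracket it. The function name and parameters are hypothetical.

```python
import math


def shaded_components(G, a, b, n=1000, tol=1e-12):
    """Grid approximation of the set E of Corollary 3.6 for G on [a, b].

    Returns (xs, values, components); components is a list of index pairs
    (left, right) such that xs[left] and xs[right] approximate a_k and b_k.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    values = [G(x) for x in xs]
    # later_max[i] = largest sampled value strictly to the right of xs[i]
    later_max = [-math.inf] * (n + 1)
    for i in range(n - 1, -1, -1):
        later_max[i] = max(later_max[i + 1], values[i + 1])
    # E is a subset of the open interval (a, b), so the endpoints are excluded.
    in_E = [0 < i < n and values[i] < later_max[i] - tol for i in range(n + 1)]
    components, start = [], None
    for i in range(n + 1):
        if in_E[i] and start is None:
            start = i
        elif not in_E[i] and start is not None:
            components.append((start - 1, i))
            start = None
    return xs, values, components


if __name__ == "__main__":
    xs, values, components = shaded_components(math.sin, 0.0, 2.0 * math.pi)
    for left, right in components:
        print(xs[left], xs[right], values[left], values[right])
```

For $G = \sin$ on $[0, 2\pi]$ the shaded set is $(0, \pi/2) \cup (\pi, 2\pi)$. On the second interval $G(a_k) = G(b_k) = 0$, while the first interval starts at $a$ itself and only satisfies $G(a_k) \le G(b_k)$, which is the exceptional case in the corollary.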


For the proof of the theorem, we define the quantity

$$\Delta_h(F)(x) = \frac{F(x + h) - F(x)}{h}.$$

We also consider the four Dini numbers at $x$ defined by

$$D^+(F)(x) = \limsup_{\substack{h \to 0 \\ h > 0}} \Delta_h(F)(x)$$
$$D_+(F)(x) = \liminf_{\substack{h \to 0 \\ h > 0}} \Delta_h(F)(x)$$
$$D^-(F)(x) = \limsup_{\substack{h \to 0 \\ h < 0}} \Delta_h(F)(x)$$
$$D_-(F)(x) = \liminf_{\substack{h \to 0 \\ h < 0}} \Delta_h(F)(x).$$

Clearly, one has $D_+ \le D^+$ and $D_- \le D^-$. To prove the theorem it
suffices to show that

(i) $D^+(F)(x) < \infty$ for a.e. $x$, and;

(ii) $D^+(F)(x) \le D_-(F)(x)$ for a.e. $x$.

Indeed, if these results hold, then by applying (ii) to $-F(-x)$ instead of
$F(x)$ we obtain $D^-(F)(x) \le D_+(F)(x)$ for a.e. $x$. Therefore

$$D^+ \le D_- \le D^- \le D_+ \le D^+ < \infty \quad \text{for a.e. } x.$$

Thus all four Dini numbers are finite and equal almost everywhere, hence
$F'(x)$ exists for almost every point $x$.

We recall that we assume that $F$ is increasing, bounded, and
continuous on $[a, b]$. For a fixed $\gamma > 0$, let

$$E_\gamma = \{x : D^+(F)(x) > \gamma\}.$$

First, we assert that $E_\gamma$ is measurable. (The proof of this simple fact is
outlined in Exercise 14.) Next, we apply Corollary 3.6 to the function
$G(x) = F(x) - \gamma x$, and note that we then have $E_\gamma \subset \bigcup_k (a_k, b_k)$, where
$F(b_k) - F(a_k) \ge \gamma(b_k - a_k)$. Consequently,

$$m(E_\gamma) \le \sum_k m((a_k, b_k)) \le \frac{1}{\gamma}\sum_k F(b_k) - F(a_k) \le \frac{1}{\gamma}(F(b) - F(a)).$$

Therefore $m(E_\gamma) \to 0$ as $\gamma$ tends to infinity, and since $\{x : D^+F(x) = \infty\} \subset
E_\gamma$ for all $\gamma$, this proves that $D^+F(x) < \infty$ almost everywhere.

Having fixed real numbers $r$ and $R$ such that $R > r$, we let

$$E = \{x \in [a, b] : D^+(F)(x) > R \text{ and } r > D_-(F)(x)\}.$$

We will have shown $D^+(F)(x) \le D_-(F)(x)$ almost everywhere once we
prove that $m(E) = 0$, since it then suffices to let $R$ and $r$ vary over the
rationals with $R > r$.

To prove that $m(E) = 0$ we may assume that $m(E) > 0$ and arrive at
a contradiction. Because $R/r > 1$ we can find an open set $\mathcal{O}$ such that
$E \subset \mathcal{O} \subset (a, b)$, yet $m(\mathcal{O}) < m(E) \cdot R/r$.

Now $\mathcal{O}$ can be written as $\bigcup I_n$, with $I_n$ disjoint open intervals. Fix
$n$ and apply Corollary 3.6 to the function $G(x) = -F(-x) + rx$ on the
interval $-I_n$. Reflecting through the origin again yields an open set
$\bigcup_k (a_k, b_k)$ contained in $I_n$, where the intervals $(a_k, b_k)$ are disjoint, with

$$F(b_k) - F(a_k) \le r(b_k - a_k).$$

However, on each interval $(a_k, b_k)$ we apply Corollary 3.6, this time to
$G(x) = F(x) - Rx$. We thus obtain an open set $\mathcal{O}_n = \bigcup_{k,j} (a_{k,j}, b_{k,j})$ of
disjoint open intervals $(a_{k,j}, b_{k,j})$ with $(a_{k,j}, b_{k,j}) \subset (a_k, b_k)$ for every $j$,
and

$$F(b_{k,j}) - F(a_{k,j}) \ge R(b_{k,j} - a_{k,j}).$$

Then using the fact that $F$ is increasing we find that

$$m(\mathcal{O}_n) = \sum_{k,j} (b_{k,j} - a_{k,j}) \le \frac{1}{R}\sum_{k,j} F(b_{k,j}) - F(a_{k,j})$$
$$\le \frac{1}{R}\sum_k F(b_k) - F(a_k) \le \frac{r}{R}\sum_k (b_k - a_k) \le \frac{r}{R}\, m(I_n).$$

Note that $\mathcal{O}_n \supset E \cap I_n$, since $D^+F(x) > R$ and $r > D_-F(x)$ for each
$x \in E$; of course, $I_n \supset \mathcal{O}_n$. We now sum in $n$. Therefore

$$m(E) = \sum_n m(E \cap I_n) \le \sum_n m(\mathcal{O}_n) \le \frac{r}{R}\sum_n m(I_n) = \frac{r}{R}\, m(\mathcal{O}) < m(E).$$

The strict inequality gives a contradiction and Theorem 3.4 is proved, at
least when $F$ is continuous.

Let us see how far we have come regarding (6) if $F$ is a monotonic
function.

Corollary 3.7 If $F$ is increasing and continuous, then $F'$ exists almost
everywhere. Moreover $F'$ is measurable, non-negative, and

$$\int_a^b F'(x)\,dx \le F(b) - F(a).$$

In particular, if $F$ is bounded on $\mathbb{R}$, then $F'$ is integrable on $\mathbb{R}$.

Proof. For $n \ge 1$, we consider the quotient

$$G_n(x) = \frac{F(x + 1/n) - F(x)}{1/n}.$$

By the previous theorem, we have that $G_n(x) \to F'(x)$ for a.e. $x$, which
shows in particular that $F'$ is measurable and non-negative.

We now extend $F$ as a continuous function on all of $\mathbb{R}$. By Fatou’s
lemma (Lemma 1.7 in Chapter 2) we know that

$$\int_a^b F'(x)\,dx \le \liminf_{n \to \infty} \int_a^b G_n(x)\,dx.$$

To complete the proof, it suffices to note that

$$\int_a^b G_n(x)\,dx = \frac{1}{1/n}\int_a^b F(x + 1/n)\,dx - \frac{1}{1/n}\int_a^b F(x)\,dx$$
$$= \frac{1}{1/n}\int_{a+1/n}^{b+1/n} F(y)\,dy - \frac{1}{1/n}\int_a^b F(x)\,dx$$
$$= \frac{1}{1/n}\int_b^{b+1/n} F(x)\,dx - \frac{1}{1/n}\int_a^{a+1/n} F(x)\,dx.$$

Since $F$ is continuous, the first and second terms converge to $F(b)$ and
$F(a)$, respectively, as $n$ goes to infinity, so the proof of the corollary is
complete.


We cannot go any farther than the inequality in the corollary if we 
allow all continuous increasing functions, as is shown by the following 
important example. 


The Cantor-Lebesgue function 

The following simple construction yields a continuous function $F : [0, 1] \to
[0, 1]$ that is increasing with $F(0) = 0$ and $F(1) = 1$, but $F'(x) = 0$
almost everywhere! Hence $F$ is of bounded variation, but

$$\int_a^b F'(x)\,dx \ne F(b) - F(a).$$

Consider the standard triadic Cantor set $\mathcal{C} \subset [0, 1]$ described at the
end of Section 1 in Chapter 1, and recall that

$$\mathcal{C} = \bigcap_{k=0}^{\infty} C_k,$$

where each $C_k$ is a disjoint union of $2^k$ closed intervals. For example,
$C_1 = [0, 1/3] \cup [2/3, 1]$. Let $F_1(x)$ be the continuous increasing function
on $[0, 1]$ that satisfies $F_1(0) = 0$, $F_1(x) = 1/2$ if $1/3 \le x \le 2/3$, $F_1(1) = 1$,
and $F_1$ is linear on $C_1$. Similarly, let $F_2(x)$ be continuous and increasing,
and such that

$$F_2(x) = \begin{cases} 0 & \text{if } x = 0, \\ 1/4 & \text{if } 1/9 \le x \le 2/9, \\ 1/2 & \text{if } 1/3 \le x \le 2/3, \\ 3/4 & \text{if } 7/9 \le x \le 8/9, \\ 1 & \text{if } x = 1, \end{cases}$$

and $F_2$ is linear on $C_2$. See Figure 6.

[Figure 6. Construction of $F_2$]

This process yields a sequence of continuous increasing functions $\{F_n\}_{n=1}^{\infty}$
such that clearly

$$|F_{n+1}(x) - F_n(x)| \le 2^{-n-1}.$$

Hence $\{F_n\}_{n=1}^{\infty}$ converges uniformly to a continuous limit $F$ called the
Cantor-Lebesgue function (Figure 7).^5 By construction, $F$ is increasing,
$F(0) = 0$, $F(1) = 1$, and we see that $F$ is constant on each interval
of the complement of the Cantor set. Since $m(\mathcal{C}) = 0$, we find that
$F'(x) = 0$ almost everywhere, as desired.
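The approximating functions $F_n$ can be computed exactly with rational arithmetic. The sketch below is one possible implementation, resting on an assumption not spelled out in the text: it starts the recursion from $F_0(x) = x$ and uses the self-similarity $F_{n}(x) = \tfrac12 F_{n-1}(3x)$ on $[0, 1/3]$, $F_n(x) = \tfrac12$ on $[1/3, 2/3]$, and $F_n(x) = \tfrac12 + \tfrac12 F_{n-1}(3x - 2)$ on $[2/3, 1]$, which reproduces the functions $F_1$ and $F_2$ described above. The function names are hypothetical.

```python
from fractions import Fraction


def cantor_approx(n: int, x: Fraction) -> Fraction:
    """Value F_n(x) of the n-th approximation, for x in [0, 1] (F_0(x) = x)."""
    if n == 0:
        return x
    third = Fraction(1, 3)
    if x <= third:
        return cantor_approx(n - 1, 3 * x) / 2
    if x < 2 * third:
        return Fraction(1, 2)
    return Fraction(1, 2) + cantor_approx(n - 1, 3 * x - 2) / 2


def max_gap(n: int, grid: int = 243) -> Fraction:
    """Largest value of |F_{n+1} - F_n| over the grid points k / grid."""
    points = [Fraction(k, grid) for k in range(grid + 1)]
    return max(abs(cantor_approx(n + 1, x) - cantor_approx(n, x)) for x in points)


if __name__ == "__main__":
    print(cantor_approx(2, Fraction(1, 6)), cantor_approx(2, Fraction(8, 9)))
    for n in range(1, 6):
        # Compare the observed gap with the bound 2^(-n-1) from the text.
        print(n, max_gap(n), Fraction(1, 2 ** (n + 1)))
```

Because the gaps between consecutive approximations are dominated by a geometric series, the uniform convergence asserted in the text follows from the Weierstrass M-test style estimate.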


^5 The reader may check that indeed this function agrees with the one given in Exercise 2
of Chapter 1.

[Figure 7. The Cantor-Lebesgue function]

The considerations in this section, as well as this last example, show
that the assumption of bounded variation guarantees the existence of a
derivative almost everywhere, but not the validity of the formula

$$\int_a^b F'(x)\,dx = F(b) - F(a).$$

In the next section, we shall present a condition on a function that will
completely settle the problem of establishing the above identity.

3.2 Absolutely continuous functions 

A function $F$ defined on $[a, b]$ is absolutely continuous if for any $\epsilon > 0$
there exists $\delta > 0$ so that

$$\sum_{k=1}^{N} |F(b_k) - F(a_k)| < \epsilon \quad \text{whenever} \quad \sum_{k=1}^{N} (b_k - a_k) < \delta,$$

and the intervals $(a_k, b_k)$, $k = 1, \ldots, N$ are disjoint. Some general
remarks are in order.

• From the definition, it is clear that absolutely continuous functions 
are continuous, and in fact uniformly continuous. 

• If F is absolutely continuous on a bounded interval, then it is also of
bounded variation on the same interval. Moreover, as is easily seen,
its total variation is continuous (in fact absolutely continuous). As
a consequence the decomposition of such a function F into two
monotonic functions given in Section 3.1 shows that each of these
functions is continuous.

• If $F(x) = \int_a^x f(y)\,dy$ where $f$ is integrable, then $F$ is absolutely
continuous. This follows at once from (ii) in Proposition 1.12,
Chapter 2.

In fact, this last remark shows that absolute continuity is a necessary
condition to impose on $F$ if we hope to prove $\int_a^b F'(x)\,dx = F(b) - F(a)$.
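To see the definition at work on the earlier example, the following sketch (hypothetical helper names, exact rational arithmetic) evaluates the Cantor-Lebesgue function at the endpoints of the $2^k$ intervals making up $C_k$. It relies on the standard self-similarity of that function, which is an assumption here rather than something spelled out in the text. The total length of the intervals tends to zero while the sum of the increments stays equal to 1, so no $\delta$ can work for $\epsilon < 1$.

```python
from fractions import Fraction


def cantor(x):
    """Cantor-Lebesgue function at a triadic rational x = k / 3**n in [0, 1].

    The recursion terminates only for such x; other inputs are not supported.
    """
    x = Fraction(x)
    if x == 0 or x == 1:
        return x
    if x <= Fraction(1, 3):
        return cantor(3 * x) / 2
    if x >= Fraction(2, 3):
        return Fraction(1, 2) + cantor(3 * x - 2) / 2
    return Fraction(1, 2)


def cantor_intervals(k):
    """Endpoints (a, b) of the 2**k closed intervals whose union is C_k."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]  # drop the middle third
        intervals = refined
    return intervals


def length_and_increment(k):
    """Return (sum of lengths, sum of |F(b) - F(a)|) over the intervals of C_k."""
    ivs = cantor_intervals(k)
    total_length = sum(b - a for a, b in ivs)
    total_increment = sum(abs(cantor(b) - cantor(a)) for a, b in ivs)
    return total_length, total_increment
```

Here `length_and_increment(k)` returns the pair $((2/3)^k, 1)$, which is the quantitative content of the failure of absolute continuity.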

Theorem 3.8 If $F$ is absolutely continuous on $[a, b]$, then $F'(x)$ exists
almost everywhere. Moreover, if $F'(x) = 0$ for a.e. $x$, then $F$ is constant.

Since an absolutely continuous function is the difference of two continuous
monotonic functions, as we have seen above, the existence of $F'(x)$
for a.e. $x$ follows from what we have already proved. To prove that
$F'(x) = 0$ a.e. implies $F$ is constant requires a more elaborate version of
the covering argument in Lemma 1.2. For the moment we revert to the
generality of $d$ dimensions to describe this.

A collection $\mathcal{B}$ of balls $\{B\}$ is said to be a Vitali covering of a set $E$
if for every $x \in E$ and any $\eta > 0$ there is a ball $B \in \mathcal{B}$, such that $x \in B$
and $m(B) < \eta$. Thus every point is covered by balls of arbitrarily small
measure.

Lemma 3.9 Suppose $E$ is a set of finite measure and $\mathcal{B}$ is a Vitali covering
of $E$. For any $\delta > 0$ we can find finitely many balls $B_1, \ldots, B_N$ in
$\mathcal{B}$ that are disjoint and so that

$$\sum_{i=1}^{N} m(B_i) \ge m(E) - \delta.$$

Proof. We apply the elementary Lemma 1.2 iteratively, with the
aim of exhausting the set $E$. It suffices to take $\delta$ sufficiently small, say
$\delta < m(E)$, and using the just cited covering lemma, we can find an initial
collection of disjoint balls $B_1, B_2, \ldots, B_{N_1}$ in $\mathcal{B}$ such that
$\sum_{i=1}^{N_1} m(B_i) \ge \gamma\delta$. (For simplicity of notation, we have written
$\gamma = 3^{-d}$.) Indeed, first we have $m(E') \ge \delta$ for an appropriate compact
subset $E'$ of $E$. Because of the compactness of $E'$, we can cover it by finitely
many balls from $\mathcal{B}$, and then the previous lemma allows us to select a
disjoint sub-collection of these balls $B_1, B_2, \ldots, B_{N_1}$ such that
$\sum_{i=1}^{N_1} m(B_i) \ge \gamma\, m(E') \ge \gamma\delta$.

With $B_1, \ldots, B_{N_1}$ as our initial sequence of balls, we consider two
possibilities: either $\sum_{i=1}^{N_1} m(B_i) \ge m(E) - \delta$ and we are done with
$N = N_1$; or, contrariwise, $\sum_{i=1}^{N_1} m(B_i) < m(E) - \delta$. In the second case,
with $E_2 = E - \bigcup_{i=1}^{N_1} \overline{B_i}$, we have $m(E_2) > \delta$ (recall that
$m(\overline{B_i}) = m(B_i)$). We then repeat the previous argument, by choosing a
compact subset $E_2'$ of $E_2$ with $m(E_2') \ge \delta$, and by noting that the balls
in $\mathcal{B}$ that are disjoint from $\bigcup_{i=1}^{N_1} \overline{B_i}$ still cover $E_2$ and in
fact give a Vitali covering for $E_2$, and hence for $E_2'$. Thus we can choose a
finite disjoint collection of these balls $B_i$, $N_1 < i \le N_2$, so that
$\sum_{N_1 < i \le N_2} m(B_i) \ge \gamma\delta$. Therefore, now
$\sum_{i=1}^{N_2} m(B_i) \ge 2\gamma\delta$, and the balls $B_i$, $1 \le i \le N_2$, are disjoint.

We again consider two alternatives, whether or not
$\sum_{i=1}^{N_2} m(B_i) \ge m(E) - \delta$. In the first case, we are done with $N_2 = N$,
and in the second case, we proceed as before. If, continuing this way, we had
reached the $k$-th stage and not stopped before then, we would have selected a
collection of disjoint balls with the sum of their measures $\ge k\gamma\delta$. In any
case, our process achieves the desired goal by the $k$-th stage if
$k \ge (m(E) - \delta)/\gamma\delta$, since in this case $\sum_{i=1}^{N_k} m(B_i) \ge m(E) - \delta$.
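The selection step borrowed from Lemma 1.2 can be made concrete in one dimension. The sketch below assumes the usual greedy form of that lemma (keep the largest remaining ball, discard everything that meets it), which is not restated in this section; the function names and the sample covering are hypothetical. With $d = 1$ the constant $\gamma = 3^{-d}$ becomes $1/3$, so the selected intervals should capture at least a third of the measure of the union.

```python
def select_disjoint(intervals):
    """Greedy selection of pairwise disjoint open intervals.

    Going through the intervals from longest to shortest, keep an interval
    exactly when it is disjoint from all intervals already kept.
    """
    by_length = sorted(intervals, key=lambda ab: ab[1] - ab[0], reverse=True)
    chosen = []
    for a, b in by_length:
        # open intervals (a, b) and (c, d) are disjoint iff b <= c or d <= a
        if all(b <= c or d <= a for c, d in chosen):
            chosen.append((a, b))
    return chosen


def union_length(intervals):
    """Lebesgue measure of a finite union of intervals."""
    total, right_end = 0, None
    for a, b in sorted(intervals):
        if right_end is None or a > right_end:
            total += b - a          # a new connected component starts
            right_end = b
        elif b > right_end:
            total += b - right_end  # only the uncovered part counts
            right_end = b
    return total


# Hypothetical sample covering, chosen only for illustration.
cover = [(0, 8), (6, 10), (9, 14), (12, 13), (20, 22)]
picked = select_disjoint(cover)
```

In the proof above this selection is applied repeatedly, each time to balls disjoint from the closures of those already chosen, so that every round adds measure at least $\gamma\delta$.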

A simple consequence is the following. 

Corollary 3.10 We can arrange the choice of the balls so that

$$m\Big(E - \bigcup_{i=1}^{N} B_i\Big) < 2\delta.$$

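In fact, let $\mathcal{O}$ be an open set, with $\mathcal{O} \supset E$ and $m(\mathcal{O} - E) < \delta$. Since
we are dealing with a Vitali covering of $E$, we can restrict all of our
choices above to balls contained in $\mathcal{O}$. If we do this, then
$(E - \bigcup_{i=1}^{N} B_i) \cup \bigcup_{i=1}^{N} B_i \subset \mathcal{O}$, where the union on the
left-hand side is a disjoint union. Hence

$$m\Big(E - \bigcup_{i=1}^{N} B_i\Big) \le m(\mathcal{O}) - m\Big(\bigcup_{i=1}^{N} B_i\Big) < m(E) + \delta - (m(E) - \delta) = 2\delta.$$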


We now return to the situation on the real line. To complete the proof
of the theorem it suffices to show that under its hypotheses we have
$F(b) = F(a)$, since if that is proved, we can replace the interval $[a, b]$ by
any sub-interval. Now let $E$ be the set of those $x \in (a, b)$ where $F'(x)$
exists and is zero. By our assumption $m(E) = b - a$. Next, momentarily
fix $\epsilon > 0$. Since for each $x \in E$ we have

$$\lim_{h \to 0} \left| \frac{F(x+h) - F(x)}{h} \right| = 0,$$

then for each $\eta > 0$ we have an open interval $I = (a_x, b_x) \subset [a, b]$
containing $x$, with

$$|F(b_x) - F(a_x)| \le \epsilon (b_x - a_x) \quad \text{and} \quad b_x - a_x < \eta.$$

The collection of these intervals forms a Vitali covering of $E$, and
hence by the lemma, for $\delta > 0$, we can select finitely many $I_i$, $1 \le i \le N$,
$I_i = (a_i, b_i)$, which are disjoint and such that

$$\sum_{i=1}^{N} m(I_i) \ge m(E) - \delta = (b - a) - \delta. \tag{9}$$

However, $|F(b_i) - F(a_i)| \le \epsilon (b_i - a_i)$, and upon adding these inequalities
we get

$$\sum_{i=1}^{N} |F(b_i) - F(a_i)| \le \epsilon (b - a),$$

since the intervals $I_i$ are disjoint and lie in $[a, b]$. Next consider the
complement of $\bigcup_{i=1}^{N} I_i$ in $[a, b]$. It consists of finitely many closed
intervals $\bigcup_{k=1}^{M} [\alpha_k, \beta_k]$ with total length $\le \delta$ because of (9).
Thus by the absolute continuity of $F$ (if $\delta$ is chosen appropriately in terms
of $\epsilon$), $\sum_{k=1}^{M} |F(\beta_k) - F(\alpha_k)| \le \epsilon$. Altogether, then,

$$|F(b) - F(a)| \le \sum_{i=1}^{N} |F(b_i) - F(a_i)| + \sum_{k=1}^{M} |F(\beta_k) - F(\alpha_k)| \le \epsilon (b - a) + \epsilon.$$

Since $\epsilon$ was positive but otherwise arbitrary, we conclude that
$F(b) - F(a) = 0$, which we set out to show.

The culmination of all our efforts is contained in the next theorem. In 
particular, it resolves our second problem of establishing the reciprocity 
between differentiation and integration. 

Theorem 3.11 Suppose $F$ is absolutely continuous on $[a, b]$. Then $F'$
exists almost everywhere and is integrable. Moreover,

$$F(x) - F(a) = \int_a^x F'(y)\,dy, \quad \text{for all } a \le x \le b.$$

By selecting $x = b$ we get $F(b) - F(a) = \int_a^b F'(y)\,dy$.

Conversely, if $f$ is integrable on $[a, b]$, then there exists an absolutely
continuous function $F$ such that $F'(x) = f(x)$ almost everywhere, and in
fact, we may take $F(x) = \int_a^x f(y)\,dy$.

Proof. Since we know that a real-valued absolutely continuous
function is the difference of two continuous increasing functions,
Corollary 3.7 shows that $F'$ is integrable on $[a, b]$. Now let
$G(x) = \int_a^x F'(y)\,dy$. Then $G$ is absolutely continuous; hence so is the
difference $G(x) - F(x)$. By the Lebesgue differentiation theorem
(Theorem 1.4), we know that $G'(x) = F'(x)$ for a.e. $x$; hence the difference
$F - G$ has derivative 0 almost everywhere. By the previous theorem we
conclude that $F - G$ is constant, and evaluating this expression at $x = a$
gives the desired result.

The converse is a consequence of the observation we made earlier,
namely that $\int_a^x f(y)\,dy$ is absolutely continuous, and the Lebesgue
differentiation theorem, which gives $F'(x) = f(x)$ almost everywhere.


3.3 Differentiability of jump functions 

We now examine monotonic functions that are not assumed to be con- 
tinuous. The resulting analysis will allow us to remove the continuity 
assumption made earlier in the proof of Theorem 3.4. 

As before, we may assume that $F$ is increasing and bounded. In
particular, these two conditions guarantee that the limits

$$F(x^-) = \lim_{\substack{y \to x \\ y < x}} F(y) \quad \text{and} \quad F(x^+) = \lim_{\substack{y \to x \\ y > x}} F(y)$$

exist. Then of course $F(x^-) \le F(x) \le F(x^+)$, and the function $F$ is
continuous at $x$ if $F(x^-) = F(x^+)$; otherwise, we say that it has a jump
discontinuity. Fortunately, dealing with these discontinuities is
manageable, since there can only be countably many of them.

Lemma 3.12 A bounded increasing function F on [a,b] has at most 
countably many discontinuities. 

Proof. If $F$ is discontinuous at $x$, we may choose a rational number
$r_x$ so that $F(x^-) < r_x < F(x^+)$. If $F$ is discontinuous at $x$ and $z$ with
$x < z$, we must have $F(x^+) \le F(z^-)$, hence $r_x < r_z$. Consequently, to
each rational number corresponds at most one discontinuity of $F$, hence
$F$ can have at most a countable number of discontinuities.

Now let $\{x_n\}_{n=1}^{\infty}$ denote the points where $F$ is discontinuous, and let
$\alpha_n$ denote the jump of $F$ at $x_n$, that is, $\alpha_n = F(x_n^+) - F(x_n^-)$. Then

$$F(x_n^+) = F(x_n^-) + \alpha_n$$

and

$$F(x_n) = F(x_n^-) + \theta_n \alpha_n$$

for some $\theta_n$, with $0 \le \theta_n \le 1$. If we let

$$j_n(x) = \begin{cases}
0 & \text{if } x < x_n, \\
\theta_n & \text{if } x = x_n, \\
1 & \text{if } x > x_n,
\end{cases}$$

then we define the jump function associated to $F$ by

$$J_F(x) = \sum_{n=1}^{\infty} \alpha_n j_n(x).$$

n=l 

For simplicity, and when no confusion is possible, we shall write $J$ instead
of $J_F$.

Our first observation is that if $F$ is bounded, then we must have

$$\sum_{n=1}^{\infty} \alpha_n \le F(b) - F(a) < \infty,$$

and hence the series defining $J$ converges absolutely and uniformly.

Lemma 3.13 If $F$ is increasing and bounded on $[a, b]$, then:

(i) $J(x)$ is discontinuous precisely at the points $\{x_n\}$ and has a jump
at $x_n$ equal to that of $F$.

(ii) The difference $F(x) - J(x)$ is increasing and continuous.

Proof. If $x \neq x_n$ for all $n$, each $j_n$ is continuous at $x$, and since the
series converges uniformly, $J$ must be continuous at $x$. If $x = x_N$ for
some $N$, then we write

$$J(x) = \sum_{n=1}^{N} \alpha_n j_n(x) + \sum_{n=N+1}^{\infty} \alpha_n j_n(x).$$

By the same argument as above, the series on the right-hand side is
continuous at $x$. Clearly, the finite sum has a jump discontinuity at $x_N$
of size $\alpha_N$.

For (ii), we note that (i) implies at once that $F - J$ is continuous.
Finally, if $y > x$ we have

$$J(y) - J(x) \le \sum_{x < x_n \le y} \alpha_n \le F(y) - F(x),$$

where the last inequality follows since $F$ is increasing. Hence

$$F(x) - J(x) \le F(y) - J(y)$$

and the difference $F - J$ is increasing, as desired.
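The definitions of $j_n$ and $J$ translate directly into code when there are only finitely many jumps. The sketch below is an illustration under that simplifying assumption; the function `make_jump_function`, the sample function `F`, and its jump data are all hypothetical. For this `F`, which jumps by $1/2$ at $1/3$ (right-continuous there, so $\theta = 1$) and by $1/4$ at $2/3$ (left-continuous there, so $\theta = 0$), the difference $F - J$ is the continuous increasing function $x$.

```python
from fractions import Fraction


def make_jump_function(jumps):
    """Build J(x) = sum of alpha_n * j_n(x) from triples (x_n, alpha_n, theta_n)."""
    def J(x):
        total = Fraction(0)
        for x_n, alpha, theta in jumps:
            if x > x_n:
                total += alpha          # j_n(x) = 1 to the right of x_n
            elif x == x_n:
                total += theta * alpha  # j_n(x_n) = theta_n
        return total
    return J


def F(x):
    """Hypothetical increasing function on [0, 1] with two jumps."""
    x = Fraction(x)
    value = x
    if x >= Fraction(1, 3):
        value += Fraction(1, 2)  # jump of size 1/2 at 1/3, attained at the point
    if x > Fraction(2, 3):
        value += Fraction(1, 4)  # jump of size 1/4 at 2/3, attained just after
    return value


J = make_jump_function([
    (Fraction(1, 3), Fraction(1, 2), 1),
    (Fraction(2, 3), Fraction(1, 4), 0),
])
```

Evaluating `F(t) - J(t)` on any grid of rationals in $[0, 1]$ returns `t` itself, in line with part (ii) of the lemma.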

Since we may write $F(x) = [F(x) - J(x)] + J(x)$, our final task is to
prove that $J$ is differentiable almost everywhere.

Theorem 3.14 If $J$ is the jump function considered above, then $J'(x)$
exists and vanishes almost everywhere.

Proof. Given any $\epsilon > 0$, we note that the set $E$ of those $x$ where

$$\limsup_{h \to 0} \frac{J(x+h) - J(x)}{h} > \epsilon \tag{10}$$

is a measurable set. (The proof of this little fact is outlined in Exercise 14
below.) Suppose $\delta = m(E)$. We need to show that $\delta = 0$. Now observe
that since the series $\sum \alpha_n$ arising in the definition of $J$ converges, then for
any $\eta$, to be chosen later, we can find an $N$ so large that $\sum_{n>N} \alpha_n < \eta$.
We then write

$$J_0(x) = \sum_{n>N} \alpha_n j_n(x),$$

and because of our choice of $N$ we have

$$J_0(b) - J_0(a) < \eta. \tag{11}$$


However, $J - J_0$ is a finite sum of terms $\alpha_n j_n(x)$, and therefore the set
of points where (10) holds, with $J$ replaced by $J_0$, differs from $E$ by
at most a finite set, the points $\{x_1, x_2, \ldots, x_N\}$. Thus we can find a
compact set $K$, with $m(K) \ge \delta/2$, so that
$\limsup_{h \to 0} \frac{J_0(x+h) - J_0(x)}{h} > \epsilon$ for each $x \in K$. Hence there are
intervals $(a_x, b_x)$ containing $x$, $x \in K$, so that
$J_0(b_x) - J_0(a_x) > \epsilon (b_x - a_x)$. We can first choose a finite collection
of these intervals that covers $K$, and then apply Lemma 1.2 to select
intervals $I_1, I_2, \ldots, I_n$ which are disjoint, and for which
$\sum_{j=1}^{n} m(I_j) \ge m(K)/3$. The intervals $I_j = (a_j, b_j)$ of course satisfy

$$J_0(b_j) - J_0(a_j) > \epsilon (b_j - a_j).$$

Now,

$$J_0(b) - J_0(a) \ge \sum_{j=1}^{n} \big( J_0(b_j) - J_0(a_j) \big) > \epsilon \sum_{j=1}^{n} (b_j - a_j) \ge \frac{\epsilon}{3}\, m(K) \ge \frac{\epsilon}{6}\, \delta.$$

Thus by (11), $\epsilon\delta/6 < \eta$, and since we are free to choose $\eta$, it follows that
$\delta = 0$ and the theorem is proved.

4 Rectifiable curves and the isoperimetric inequality 

We turn to the further study of rectifiable curves and take up first the
validity of the formula

$$L = \int_a^b \big( x'(t)^2 + y'(t)^2 \big)^{1/2}\,dt \tag{12}$$

for the length $L$ of the curve parametrized by $(x(t), y(t))$.

We have already seen that rectifiable curves are precisely the curves
where, besides the assumed continuity of $x(t)$ and $y(t)$, these functions
are of bounded variation. However a simple example shows that
formula (12) does not always hold in this context. Indeed, let $x(t) = F(t)$
and $y(t) = F(t)$, where $F$ is the Cantor-Lebesgue function and $0 \le t \le 1$.
Then this parametrized curve traces out the straight line from $(0, 0)$ to
$(1, 1)$ and has length $\sqrt{2}$, yet $x'(t) = y'(t) = 0$ for a.e. $t$.
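The mismatch in this example can be checked by computing inscribed polygon lengths. The sketch below is illustrative only: the helper names are hypothetical, and the Cantor-Lebesgue function is evaluated through its standard self-similarity at triadic rationals, an assumption not spelled out in the text. Every inscribed polygon with nodes at $k/3^n$ has length $\sqrt{2}$, whereas the integrand in (12) vanishes almost everywhere and so the integral is 0.

```python
import math
from fractions import Fraction


def cantor(x):
    """Cantor-Lebesgue function at a triadic rational x = k / 3**n in [0, 1].

    The recursion terminates only for such x; other inputs are not supported.
    """
    x = Fraction(x)
    if x == 0 or x == 1:
        return x
    if x <= Fraction(1, 3):
        return cantor(3 * x) / 2
    if x >= Fraction(2, 3):
        return Fraction(1, 2) + cantor(3 * x - 2) / 2
    return Fraction(1, 2)


def inscribed_length(n):
    """Length of the polygon through (F(t), F(t)) at the nodes t = k / 3**n."""
    nodes = 3 ** n
    values = [float(cantor(Fraction(k, nodes))) for k in range(nodes + 1)]
    # each segment joins (v0, v0) to (v1, v1), so its length is hypot(dv, dv)
    return sum(math.hypot(v1 - v0, v1 - v0) for v0, v1 in zip(values, values[1:]))
```

Since $F$ is increasing, the segment lengths add up to $\sqrt{2}\,(F(1) - F(0))$ for every such partition, which is why refining the partition does not change the result.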

The integral formula expressing the length $L$ is in fact valid if we
assume in addition that the coordinate functions of the parametrization
are absolutely continuous.

Theorem 4.1 Suppose $(x(t), y(t))$ is a curve defined for $a \le t \le b$. If
both $x(t)$ and $y(t)$ are absolutely continuous, then the curve is rectifiable,
and if $L$ denotes its length, we have

$$L = \int_a^b \big( x'(t)^2 + y'(t)^2 \big)^{1/2}\,dt.$$

Note that if $F(t) = x(t) + iy(t)$ is absolutely continuous then it is
automatically of bounded variation, and hence the curve is rectifiable. The
identity (12) is an immediate consequence of the proposition below, which
can be viewed as a more precise version of Corollary 3.7 for absolutely
continuous functions.
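For a smooth (hence absolutely continuous) parametrization, the agreement between polygon lengths and the integral in Theorem 4.1 can be observed numerically. The sketch below is a rough illustration with hypothetical helper names; it compares inscribed polygon lengths with a midpoint-rule approximation of $\int_a^b |z'(t)|\,dt$ for the unit circle, a test curve chosen here only because its length $2\pi$ is known.

```python
import math


def inscribed_length(z, a, b, n):
    """Length of the polygon with vertices z(t_j), t_j equally spaced in [a, b]."""
    pts = [z(a + (b - a) * j / n) for j in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def integral_length(dz, a, b, n):
    """Midpoint-rule approximation of the integral of |z'(t)| over [a, b]."""
    h = (b - a) / n
    return h * sum(math.hypot(*dz(a + (j + 0.5) * h)) for j in range(n))


def circle(t):
    """Unit circle, z(t) = (cos t, sin t)."""
    return (math.cos(t), math.sin(t))


def circle_velocity(t):
    """Derivative z'(t) of the unit circle parametrization."""
    return (-math.sin(t), math.cos(t))
```

With `n = 1000` both `inscribed_length(circle, 0, 2 * math.pi, n)` and `integral_length(circle_velocity, 0, 2 * math.pi, n)` are close to $2\pi$, the polygon value approaching it from below.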


Proposition 4.2 Suppose $F$ is complex-valued and absolutely continuous
on $[a, b]$. Then

$$T_F(a, b) = \int_a^b |F'(t)|\,dt.$$

In fact, because of Theorem 3.11, for any partition $a = t_0 < t_1 < \cdots < t_N = b$
of $[a, b]$, we have

$$\sum_{j=1}^{N} |F(t_j) - F(t_{j-1})| = \sum_{j=1}^{N} \left| \int_{t_{j-1}}^{t_j} F'(t)\,dt \right| \le \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} |F'(t)|\,dt = \int_a^b |F'(t)|\,dt.$$

So this proves

$$T_F(a, b) \le \int_a^b |F'(t)|\,dt. \tag{13}$$
To prove the reverse inequality, fix $\epsilon > 0$, and using Theorem 2.4 in
Chapter 2 find a step function $g$ on $[a, b]$, such that $F' = g + h$ with
$\int_a^b |h(t)|\,dt < \epsilon$. Set $G(x) = \int_a^x g(t)\,dt$, and $H(x) = \int_a^x h(t)\,dt$.
Then $F = G + H$ (up to the additive constant $F(a)$, which does not affect
total variations), and as is easily seen

$$T_F(a, b) \ge T_G(a, b) - T_H(a, b).$$

However, by (13) $T_H(a, b) < \epsilon$, so that

$$T_F(a, b) > T_G(a, b) - \epsilon.$$

Now partition the interval $[a, b]$, as $a = t_0 < \cdots < t_N = b$, so that the step
function $g$ is constant on each of the intervals $(t_{j-1}, t_j)$, $j = 1, 2, \ldots, N$.
Then

$$T_G(a, b) \ge \sum_{j=1}^{N} |G(t_j) - G(t_{j-1})| = \sum_{j=1}^{N} \left| \int_{t_{j-1}}^{t_j} g(t)\,dt \right| = \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} |g(t)|\,dt = \int_a^b |g(t)|\,dt.$$

Since $\int_a^b |g(t)|\,dt \ge \int_a^b |F'(t)|\,dt - \epsilon$, we obtain as a consequence that

$$T_F(a, b) \ge \int_a^b |F'(t)|\,dt - 2\epsilon,$$

and letting $\epsilon \to 0$ we establish the assertion and also the theorem.

Now, any curve (viewed as the image of a mapping $t \mapsto z(t)$) can in
fact be realized by many different parametrizations. A rectifiable curve,
however, has associated to it a unique natural parametrization, the
arc-length parametrization. Indeed, let $L(A, B)$ denote the length function
(considered in Section 3.1), and for the variable $t$ in $[a, b]$ set
$s = s(t) = L(a, t)$. Then $s(t)$, the arc-length, is a continuous increasing
function which maps $[a, b]$ to $[0, L]$, where $L$ is the length of the curve. The
arc-length parametrization of the curve is now given by the pair
$\tilde{z}(s) = \tilde{x}(s) + i\tilde{y}(s)$, where $\tilde{z}(s) = z(t)$, for $s = s(t)$. Notice that in
this way the function $\tilde{z}(s)$ is well defined on $[0, L]$, since if
$s(t_1) = s(t_2)$, $t_1 < t_2$, then in fact $z(t)$ does not vary in the interval
$[t_1, t_2]$ and thus $z(t_1) = z(t_2)$. Moreover $|\tilde{z}(s_1) - \tilde{z}(s_2)| \le |s_1 - s_2|$
for all pairs $s_1, s_2 \in [0, L]$, since the left-hand side of the inequality is the
distance between two points on the curve, while the right-hand side is the
length of the portion of the curve joining these two points. Also, as $s$ varies
from 0 to $L$, $\tilde{z}(s)$ traces out the same points (in the same order) that $z(t)$
does as $t$ varies from $a$ to $b$.

Theorem 4.3 Suppose $(x(t), y(t))$, $a \le t \le b$, is a rectifiable curve that
has length $L$. Consider the arc-length parametrization
$\tilde{z}(s) = (\tilde{x}(s), \tilde{y}(s))$ described above. Then $\tilde{x}$ and $\tilde{y}$ are absolutely
continuous, $|\tilde{z}'(s)| = 1$ for almost every $s \in [0, L]$, and

$$L = \int_0^L \big( \tilde{x}'(s)^2 + \tilde{y}'(s)^2 \big)^{1/2}\,ds.$$

Proof. We noted that $|\tilde{z}(s_1) - \tilde{z}(s_2)| \le |s_1 - s_2|$, so it follows
immediately that $\tilde{z}(s)$ is absolutely continuous, hence differentiable almost
everywhere. Moreover, this inequality also proves that $|\tilde{z}'(s)| \le 1$, for
almost every $s$. By definition the total variation of $\tilde{z}$ equals $L$, and by
the previous theorem we must have $L = \int_0^L |\tilde{z}'(s)|\,ds$. Finally, we note
that this identity is possible only when $|\tilde{z}'(s)| = 1$ almost everywhere.


4.1* Minkowski content of a curve

The proof we give below of the isoperimetric inequality depends in a key
way on the concept of the Minkowski content. While the idea of this
content has an interest in its own right, it is particularly relevant for us
here. This is because the rectifiability of a curve is tantamount to having
(finite) Minkowski content, with that quantity the same as the length of
the curve.

We begin our discussion of these matters with several definitions. A
curve parametrized by $z(t) = (x(t), y(t))$, $a \le t \le b$, is said to be simple
if the mapping $t \mapsto z(t)$ is injective for $t \in [a, b]$. It is a closed simple
curve if the mapping $t \mapsto z(t)$ is injective for $t$ in $[a, b)$, and $z(a) = z(b)$.
More generally, a curve is quasi-simple if the mapping is injective for $t$
in the complement of finitely many points in $[a, b]$.



Figure 8. A quasi-simple curve 


We shall find it convenient to designate by $\Gamma$ the pointset traced out by
the curve $z(t)$ as $t$ varies in $[a, b]$, that is, $\Gamma = \{z(t) : a \le t \le b\}$. For any
compact set $K \subset \mathbb{R}^2$ (we take $K = \Gamma$ below), we denote by $K^\delta$ the open
set that consists of all points at distance (strictly) less than $\delta$ from $K$,

$$K^\delta = \{x \in \mathbb{R}^2 : d(x, K) < \delta\}.$$

Figure 9. The curve $\Gamma$ and the set $\Gamma^\delta$

We then say that the set $K$ has Minkowski content* if the limit

$$\lim_{\delta \to 0} \frac{m(K^\delta)}{2\delta}$$

exists. When this limit exists, we denote it by $\mathcal{M}(K)$.

Theorem 4.4 Suppose $\Gamma = \{z(t), a \le t \le b\}$ is a quasi-simple curve. The
Minkowski content of $\Gamma$ exists if and only if $\Gamma$ is rectifiable. When this
is the case and $L$ is the length of the curve, then $\mathcal{M}(\Gamma) = L$.
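Two elementary cases make the statement tangible. The sketch below evaluates the ratio $m(K^\delta)/(2\delta)$ using closed-form areas from plane geometry, which are assumptions of this illustration and not formulas from the text: for a circle of radius $r$ the set $K^\delta$ is an annulus (when $\delta < r$), and for a straight segment it is a rectangle capped by two half-discs. The function names are hypothetical.

```python
import math


def ratio_circle(r, delta):
    """m(K^delta) / (2 delta) for K a circle of radius r, assuming 0 < delta < r."""
    area = math.pi * ((r + delta) ** 2 - (r - delta) ** 2)  # annulus
    return area / (2 * delta)


def ratio_segment(length, delta):
    """m(K^delta) / (2 delta) for K a straight segment of the given length."""
    area = 2 * delta * length + math.pi * delta ** 2  # rectangle plus two half-discs
    return area / (2 * delta)
```

For the circle the ratio equals $2\pi r$ for every admissible $\delta$, while for the segment it equals $L + \pi\delta/2$ and tends to $L$ as $\delta \to 0$, so in both cases the Minkowski content agrees with the length.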

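To prove the theorem, we also consider for any compact set $K$

$$\mathcal{M}^*(K) = \limsup_{\delta \to 0} \frac{m(K^\delta)}{2\delta} \quad \text{and} \quad \mathcal{M}_*(K) = \liminf_{\delta \to 0} \frac{m(K^\delta)}{2\delta}$$

(both taken as extended positive numbers). Of course $\mathcal{M}_*(K) \le \mathcal{M}^*(K)$.
To say that the Minkowski content exists is the same as saying that
$\mathcal{M}^*(K) < \infty$ and $\mathcal{M}_*(K) = \mathcal{M}^*(K)$. Their common value is then $\mathcal{M}(K)$.

The theorem just stated is the consequence of two propositions
concerning $\mathcal{M}_*(K)$ and $\mathcal{M}^*(K)$. The first is as follows.

Proposition 4.5 Suppose $\Gamma = \{z(t), a \le t \le b\}$ is a quasi-simple curve.
If $\mathcal{M}_*(\Gamma) < \infty$, then the curve is rectifiable, and if $L$ denotes its length,
then

$$L \le \mathcal{M}_*(\Gamma).$$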

The proof depends on the following simple observation. 

Lemma 4.6 If $\Gamma = \{z(t), a \le t \le b\}$ is any curve, and $\Delta = |z(b) - z(a)|$
is the distance between its end-points, then $m(\Gamma^\delta) \ge 2\delta\Delta$.

Proof. Since the distance function and the Lebesgue measure are
invariant under translations and rotations (see Section 3 in Chapter 1
and Problem 4 in Chapter 2) we may transform the situation by an
appropriate composition of these motions. Therefore we may assume
that the end-points of the curve have been placed on the $x$-axis, and
thus we may suppose that $z(a) = (A, 0)$, $z(b) = (B, 0)$ with $A < B$, and
$\Delta = B - A$ (in the case $A = B$ the conclusion is automatically verified).

By the continuity of the function $x(t)$, there is for each $x$ in $[A, B]$ a
value $\bar{t}$ in $[a, b]$, such that $x = x(\bar{t})$. Since $Q = (x(\bar{t}), y(\bar{t})) \in \Gamma$, the set
$\Gamma^\delta$ contains a segment parallel to the $y$-axis, of length $2\delta$ centered at $Q$
lying above $x$ (see Figure 10). In other words the slice $(\Gamma^\delta)_x$ contains
the interval $(y(\bar{t}) - \delta, y(\bar{t}) + \delta)$, and hence $m_1((\Gamma^\delta)_x) \ge 2\delta$ (where $m_1$ is
the one-dimensional Lebesgue measure). However by Fubini's theorem

$$m(\Gamma^\delta) = \int_{\mathbb{R}} m_1((\Gamma^\delta)_x)\,dx \ge \int_A^B m_1((\Gamma^\delta)_x)\,dx \ge 2\delta(B - A) = 2\delta\Delta,$$

and the lemma is proved.

* This is one-dimensional Minkowski content; variants are in Exercise 28 and also in
Chapter 7 below.



Figure 10. The situation in Lemma 4.6 


We now pass to the proof of the proposition. Let us assume first that
the curve is simple. Let $P$ be any partition $a = t_0 < t_1 < \cdots < t_N = b$
of the interval $[a, b]$, and let $L_P$ denote the length of the corresponding
polygonal line, that is,

$$L_P = \sum_{j=1}^{N} |z(t_j) - z(t_{j-1})|.$$

For each $\epsilon > 0$, the continuity of $t \mapsto z(t)$ guarantees the existence of $N$
proper closed sub-intervals $I_j = [a_j, b_j]$ of $(t_{j-1}, t_j)$, so that

$$\sum_{j=1}^{N} |z(b_j) - z(a_j)| \ge L_P - \epsilon.$$

Let $\Gamma_j$ denote the segment of the curve given by $\Gamma_j = \{z(t);\ t \in I_j\}$. Since
the closed intervals $I_1, \ldots, I_N$ are disjoint, it follows by the simplicity of
the curve that the compact sets $\Gamma_1, \Gamma_2, \ldots, \Gamma_N$ are disjoint. However,
$\Gamma \supset \bigcup_{j=1}^{N} \Gamma_j$ and $\Gamma^\delta \supset \bigcup_{j=1}^{N} (\Gamma_j)^\delta$. Moreover, the disjointness of
the $\Gamma_j$ implies that the sets $(\Gamma_j)^\delta$ are also disjoint for sufficiently small
$\delta$. Hence for those $\delta$, the previous lemma applied to each $\Gamma_j$ gives

$$m(\Gamma^\delta) \ge \sum_{j=1}^{N} m((\Gamma_j)^\delta) \ge 2\delta \sum_{j=1}^{N} |z(b_j) - z(a_j)|.$$

As a result, $m(\Gamma^\delta)/(2\delta) \ge L_P - \epsilon$, and a passage to the limit gives
$\mathcal{M}_*(\Gamma) \ge L_P - \epsilon$. Since this inequality is true for all partitions $P$ and
all $\epsilon > 0$, it implies that the curve is rectifiable and its length does not
exceed $\mathcal{M}_*(\Gamma)$.

The proof when the curve is merely quasi-simple is similar, except
the partitions $P$ considered must be refined so as to include as partition
points those (finitely many) points in whose complement (in $[a, b]$) the
mapping $t \mapsto z(t)$ is injective. The details may be left to the reader.

The second proposition is in the reverse direction. 

Proposition 4.7 Suppose $\Gamma = \{z(t), a \le t \le b\}$ is a rectifiable curve with
length $L$. Then

$$\mathcal{M}^*(\Gamma) \le L.$$

The quantities $\mathcal{M}^*(\Gamma)$ and $L$ are of course independent of the
parametrization used; since the curve is rectifiable, it will be convenient to
use the arc-length parametrization. Thus we write the curve as
$z(s) = (x(s), y(s))$, with $0 \le s \le L$, and recall that then $z(s)$ is absolutely
continuous and $|z'(s)| = 1$ for a.e. $s \in [0, L]$.

We first fix any $0 < \epsilon < 1$, and find a measurable set $E_\epsilon \subset \mathbb{R}$ and a
positive number $r_\epsilon$ such that $m(E_\epsilon) < \epsilon$ and

$$\sup_{0 < |h| < r_\epsilon} \left| \frac{z(s+h) - z(s)}{h} - z'(s) \right| < \epsilon \quad \text{for all } s \in [0, L] - E_\epsilon. \tag{14}$$

Indeed, for each integer $n$, let

$$F_n(s) = \sup_{0 < |h| < 1/n} \left| \frac{z(s+h) - z(s)}{h} - z'(s) \right|$$

(where $z(s)$ has been extended outside $[0, L]$, so that $z(s) = z(0)$, when
$s < 0$, and $z(s) = z(L)$ when $s > L$). Because $z(s)$ is continuous the
supremum over $h$ in the definition of $F_n(s)$ can be replaced by a supremum
of countably many measurable functions, and hence each $F_n$ is measurable.
However, $F_n(s) \to 0$, as $n \to \infty$ for a.e. $s \in [0, L]$. Thus by Egorov's
theorem the convergence is uniform outside a set $E_\epsilon$ with $m(E_\epsilon) < \epsilon$,
and so we merely need to choose $r_\epsilon = 1/n$ for sufficiently large $n$ to
establish (14). It will be convenient in what follows to assume, as we may,
that $z'(s)$ exists and $|z'(s)| = 1$ for every $s \notin E_\epsilon$.

Now for any $0 < \rho < r_\epsilon$ (with $\rho < 1$), we partition the interval $[0, L]$
into consecutive closed intervals, each of length $\rho$ (except that the last
interval may have length $\le \rho$). Then there is a total of $N \le L/\rho + 1$ such
intervals that arise. We call these intervals $I_1, I_2, \ldots, I_N$, and divide them
into two classes. The first class, those intervals $I_j$ we call "good," are the
ones that enjoy the property that $I_j \not\subset E_\epsilon$. The second class, those which
are "bad," have the property that $I_j \subset E_\epsilon$. As a result,
$\bigcup_{I_j \text{ bad}} I_j \subset E_\epsilon$; hence the union has measure $< \epsilon$.

We have of course that $[0, L] \subset \bigcup_{j=1}^{N} I_j$, so if we denote by $\Gamma_j$ the
segment of $\Gamma$ given by $\{z(s) : s \in I_j\}$, then $\Gamma = \bigcup_{j=1}^{N} \Gamma_j$, and as a result
$\Gamma^\delta = \bigcup_{j=1}^{N} (\Gamma_j)^\delta$ and $m(\Gamma^\delta) \le \sum_{j=1}^{N} m((\Gamma_j)^\delta)$.

We consider first the contribution of $m((\Gamma_j)^\delta)$ when $I_j$ is a good
interval. Recall that for such $I_j = [a_j, b_j]$ there is an $s_0 \in I_j$ which is not
in $E_\epsilon$, and therefore (14) holds for $s = s_0$. Let us now visualize $\Gamma_j$ by
introducing a coordinate system such that $z(s_0) = 0$ and $z'(s_0) = 1$ (which
we may assume after a suitable translation and rotation). We maintain
the notations $z(s)$ and $\Gamma_j$ for the so transformed segment of the curve.

Figure 11. Estimate of $m((\Gamma_j)^\delta)$ for a good interval $I_j$

Note that as $h$ varies over the interval $[a_j - s_0, b_j - s_0]$, $s_0 + h$ varies
over $I_j = [a_j, b_j]$. Therefore $\Gamma_j$ is contained in the rectangle

$$[a_j - s_0 - \epsilon\rho,\ b_j - s_0 + \epsilon\rho] \times [-\epsilon\rho,\ \epsilon\rho],$$

since $|h| \le \rho < r_\epsilon$ by construction, and $|z(s_0 + h) - h| < \epsilon |h|$ by (14). See
Figure 11. Thus $(\Gamma_j)^\delta$ is contained in the rectangle

$$[a_j - s_0 - \epsilon\rho - \delta,\ b_j - s_0 + \epsilon\rho + \delta] \times [-\epsilon\rho - \delta,\ \epsilon\rho + \delta],$$

which has measure $\le (\rho + 2\epsilon\rho + 2\delta)(2\epsilon\rho + 2\delta)$. Therefore, since $\epsilon \le 1$,
we have

$$m((\Gamma_j)^\delta) \le 2\delta\rho + O(\epsilon\delta\rho + \delta^2 + \epsilon\rho^2), \tag{15}$$

where the bound arising in $O$ is independent of $\epsilon$, $\delta$, and $\rho$. This is our
desired estimate for the good intervals.

To pass to the remaining intervals we use the fact that $|z(s) - z(s')| \le |s - s'|$
for all $s$ and $s'$. Thus in every case $\Gamma_j$ is contained in a ball
(disc) of radius $\rho$, and hence $(\Gamma_j)^\delta$ is contained in a ball of radius $\rho + \delta$.
Therefore we have the crude estimate

$$m((\Gamma_j)^\delta) = O(\delta^2 + \rho^2). \tag{16}$$


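We now sum (15) over the good intervals (of which there are at most
$L/\rho + 1$), and (16) over the bad intervals. There are at most $\epsilon/\rho + 1$
of the latter kind, since their union is included in $E_\epsilon$ and this set has
measure $< \epsilon$. Altogether, then,

$$m(\Gamma^\delta) \le 2\delta L + 2\delta\rho + O(\epsilon\delta + \delta^2/\rho + \epsilon\rho) + O\big( (\epsilon/\rho + 1)(\delta^2 + \rho^2) \big),$$

which simplifies to the inequalities

$$\frac{m(\Gamma^\delta)}{2\delta} \le L + O\Big( \rho + \epsilon + \frac{\delta}{\rho} + \frac{\epsilon\rho}{\delta} + \frac{\epsilon\delta}{\rho} + \delta + \frac{\rho^2}{\delta} \Big)
\le L + O\Big( \rho + \epsilon + \frac{\delta}{\rho} + \frac{\epsilon\rho}{\delta} + \frac{\rho^2}{\delta} \Big),$$

where in the last line we have used the fact that $\epsilon < 1$ and $\rho < 1$. In
order to obtain a favorable estimate from this as $\delta \to 0$, we need to
choose $\rho$ (the length of the sub-intervals) very roughly of the same size
as $\delta$. An effective choice is $\rho = \delta/\epsilon^{1/2}$. If we fix this choice and restrict
our attention to $\delta$ for which $0 < \delta < \epsilon^{1/2} r_\epsilon$, then automatically $\rho < r_\epsilon$
as required by (14). Inserting $\rho = \delta/\epsilon^{1/2}$ in the above inequality gives

$$\frac{m(\Gamma^\delta)}{2\delta} \le L + O\Big( \frac{\delta}{\epsilon^{1/2}} + \epsilon + \epsilon^{1/2} + \frac{\delta}{\epsilon} \Big),$$

and thus

$$\limsup_{\delta \to 0} \frac{m(\Gamma^\delta)}{2\delta} \le L + O(\epsilon + \epsilon^{1/2}).$$

Now we can let $\epsilon \to 0$ to obtain the desired conclusion $\mathcal{M}^*(\Gamma) \le L$, and
the proofs of the proposition and theorem are complete.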

4.2* Isoperimetric inequality

The isoperimetric inequality in the plane states, in effect, that among all 
curves of a given length it is the circle that encloses the maximum area. 
A simple form of this theorem already appeared in Book I. While the 
proof given there had the virtue of being brief and elegant, it did suffer 
several shortcomings. Among them the “area” in the statement was 
defined indirectly via a technical artifice, and the scope of the conclusion 
was limited because only relatively smooth curves were considered. Here 
we want to remedy those defects and deal with a general version of the 
result. 

We suppose that $\Omega$ is a bounded open subset of $\mathbb{R}^2$ and that its boundary
$\overline{\Omega} - \Omega$ is a rectifiable curve $\Gamma$, with length $\ell(\Gamma)$. We do not require
that $\Gamma$ be a simple closed curve. The isoperimetric theorem then asserts
the following.

Theorem 4.8 $4\pi\, m(\Omega) \le \ell(\Gamma)^2$.

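Proof. For each $\delta > 0$ we consider the outer set

$$\Omega_+(\delta) = \{x \in \mathbb{R}^2 : d(x, \overline{\Omega}) < \delta\},$$

and the inner set

$$\Omega_-(\delta) = \{x \in \mathbb{R}^2 : d(x, \Omega^c) \ge \delta\}.$$

Thus $\Omega_-(\delta) \subset \Omega \subset \Omega_+(\delta)$.

We notice that for $\Gamma^\delta = \{x : d(x, \Gamma) < \delta\}$ we have

$$\Omega_+(\delta) = \Omega_-(\delta) \cup \Gamma^\delta \tag{17}$$

and that this union is disjoint. Moreover, if $D(\delta)$ is the open ball (disc)
of radius $\delta$ centered at the origin, $D(\delta) = \{x \in \mathbb{R}^2, |x| < \delta\}$, then clearly

$$\Omega_+(\delta) \supset \Omega + D(\delta), \qquad \Omega \supset \Omega_-(\delta) + D(\delta). \tag{18}$$

We now apply the Brunn-Minkowski inequality (Theorem 5.1 in
Chapter 1) to the first inclusion, and obtain

$$m(\Omega_+(\delta)) \ge \big( m(\Omega)^{1/2} + m(D(\delta))^{1/2} \big)^2.$$

Since $m(D(\delta)) = \pi\delta^2$ (this standard formula is established in Exercise 14
in the previous chapter), and $(A + B)^2 \ge A^2 + 2AB$ whenever $A$ and $B$
are positive, we find that

$$m(\Omega_+(\delta)) \ge m(\Omega) + 2\pi^{1/2}\delta\, m(\Omega)^{1/2}.$$

Similarly, $m(\Omega) \ge m(\Omega_-(\delta)) + 2\pi^{1/2}\delta\, m(\Omega_-(\delta))^{1/2}$ using the second
inclusion in (18), which implies

$$-m(\Omega_-(\delta)) \ge -m(\Omega) + 2\pi^{1/2}\delta\, m(\Omega_-(\delta))^{1/2}.$$

Now by (17)

$$m(\Gamma^\delta) = m(\Omega_+(\delta)) - m(\Omega_-(\delta))$$

and by the inequalities above, we have

$$m(\Gamma^\delta) \ge 2\pi^{1/2}\delta \big( m(\Omega)^{1/2} + m(\Omega_-(\delta))^{1/2} \big).$$

We now divide both sides by $2\delta$ and take the limsup as $\delta \to 0$. This
yields

$$\mathcal{M}^*(\Gamma) \ge \pi^{1/2} \big( 2\, m(\Omega)^{1/2} \big)$$

since $\Omega_-(\delta) \nearrow \Omega$ as $\delta \to 0$. However, by Proposition 4.7,
$\ell(\Gamma) \ge \mathcal{M}^*(\Gamma)$, so

$$\ell(\Gamma) \ge 2\pi^{1/2}\, m(\Omega)^{1/2},$$

which proves the theorem.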
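As a sanity check on the inequality just proved, one can evaluate the deficit $\ell(\Gamma)^2 - 4\pi\, m(\Omega)$ for a family of regions with known area and perimeter. The sketch below uses regular $n$-gons inscribed in a circle, with the standard area and perimeter formulas taken as assumptions of the illustration; the function names are hypothetical.

```python
import math


def regular_polygon(n, radius=1.0):
    """Area and perimeter of the regular n-gon inscribed in a circle of given radius."""
    area = 0.5 * n * radius ** 2 * math.sin(2 * math.pi / n)
    perimeter = 2 * n * radius * math.sin(math.pi / n)
    return area, perimeter


def isoperimetric_deficit(area, perimeter):
    """perimeter**2 - 4*pi*area, which is nonnegative by Theorem 4.8."""
    return perimeter ** 2 - 4 * math.pi * area
```

The deficit is strictly positive for every polygon and shrinks as $n$ grows, while for the disc itself (area $\pi r^2$, perimeter $2\pi r$) it vanishes, consistent with the circle being the extremal case.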

Remark. A similar result holds even without the assumption that the
boundary is a (rectifiable) curve. In fact the proof shows that for any
bounded open set $\Omega$ whose boundary is $\Gamma$ we have

$$4\pi\, m(\Omega) \le \mathcal{M}^*(\Gamma)^2.$$


5 Exercises 

1. Suppose $\varphi$ is an integrable function on $\mathbb{R}^d$ with $\int_{\mathbb{R}^d} \varphi(x)\,dx = 1$. Set
$K_\delta(x) = \delta^{-d}\varphi(x/\delta)$, $\delta > 0$.

(a) Prove that $\{K_\delta\}_{\delta>0}$ is a family of good kernels.

(b) Assume in addition that $\varphi$ is bounded and supported in a bounded set.
Verify that $\{K_\delta\}_{\delta>0}$ is an approximation to the identity.

(c) Show that Theorem 2.3 (convergence in the $L^1$-norm) holds for good kernels
as well.


2. Suppose $\{K_\delta\}$ is a family of kernels that satisfies:

(i) $|K_\delta(x)| \le A\delta^{-d}$ for all $\delta > 0$.

(ii) $|K_\delta(x)| \le A\delta/|x|^{d+1}$ for all $\delta > 0$.

(iii) $\int_{\mathbb{R}^d} K_\delta(x)\,dx = 0$ for all $\delta > 0$.

Thus $K_\delta$ satisfies conditions (i) and (ii) of approximations to the identity, but the
average value of $K_\delta$ is 0 instead of 1. Show that if $f$ is integrable on $\mathbb{R}^d$, then

$$(f * K_\delta)(x) \to 0 \quad \text{for a.e. } x, \text{ as } \delta \to 0.$$


3. Suppose 0 is a point of (Lebesgue) density of the set $E \subset \mathbb{R}$. Show that for each
of the individual conditions below there is an infinite sequence of points $x_n \in E$,
with $x_n \neq 0$, and $x_n \to 0$ as $n \to \infty$.

(a) The sequence also satisfies $-x_n \in E$ for all $n$.

(b) In addition, $2x_n$ belongs to $E$ for all $n$.

Generalize.


4. Prove that if $f$ is integrable on $\mathbb{R}^d$, and $f$ is not identically zero, then

$$f^*(x) \ge \frac{c}{|x|^d}, \quad \text{for some } c > 0 \text{ and all } |x| \ge 1.$$

Conclude that $f^*$ is not integrable on $\mathbb{R}^d$. Then, show that the weak type estimate

$$m(\{x : f^*(x) > \alpha\}) \le c/\alpha$$

for all $\alpha > 0$ whenever $\int |f| = 1$, is best possible in the following sense: if $f$ is
supported in the unit ball with $\int |f| = 1$, then

$$m(\{x : f^*(x) > \alpha\}) \ge c'/\alpha$$

for some $c' > 0$ and all sufficiently small $\alpha$.

[Hint: For the first part, use the fact that $\int_B |f| > 0$ for some ball $B$.]


5. Consider the function on $\mathbb{R}$ defined by

$$f(x) = \begin{cases}
\dfrac{1}{|x| (\log 1/|x|)^2} & \text{if } |x| \le 1/2, \\
0 & \text{otherwise.}
\end{cases}$$

(a) Verify that $f$ is integrable.

(b) Establish the inequality

$$f^*(x) \ge \frac{c}{|x| (\log 1/|x|)} \quad \text{for some } c > 0 \text{ and all } |x| \le 1/2,$$

to conclude that the maximal function $f^*$ is not locally integrable.


6. In one dimension there is a version of the basic inequality (1) for the maximal
function in the form of an identity. We define the "one-sided" maximal function

$$f_+^*(x) = \sup_{h > 0} \frac{1}{h} \int_x^{x+h} |f(y)|\,dy.$$

If $E_\alpha^+ = \{x \in \mathbb{R} : f_+^*(x) > \alpha\}$, then

$$m(E_\alpha^+) = \frac{1}{\alpha} \int_{E_\alpha^+} |f(y)|\,dy.$$

[Hint: Apply Lemma 3.5 to $F(x) = \int_0^x |f(y)|\,dy - \alpha x$. Then $E_\alpha^+$ is the union of
disjoint intervals $(a_k, b_k)$ with $\int_{a_k}^{b_k} |f(y)|\,dy = \alpha(b_k - a_k)$.]

7. Using Corollary 1.5, prove that if a measurable subset $E$ of $[0, 1]$ satisfies
$m(E \cap I) \ge \alpha\, m(I)$ for some $\alpha > 0$ and all intervals $I$ in $[0, 1]$, then $E$ has measure
1. See also Exercise 28 in Chapter 1.

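8. Suppose $A$ is a Lebesgue measurable set in $\mathbb{R}$ with $m(A) > 0$. Does there exist
a sequence $\{s_n\}_{n=1}^{\infty}$ such that the complement of $\bigcup_{n=1}^{\infty} (A + s_n)$ in $\mathbb{R}$ has
measure zero?

[Hint: For every $\epsilon > 0$, find an interval $I_\epsilon$ of length $\ell_\epsilon$ such that
$m(A \cap I_\epsilon) \ge (1 - \epsilon)\, m(I_\epsilon)$. Consider $\bigcup_{k=-\infty}^{\infty} (A + t_k)$, with $t_k = k\ell_\epsilon$.
Then vary $\epsilon$.]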

9. Let $F$ be a closed subset in $\mathbb{R}$, and $\delta(x)$ the distance from $x$ to $F$, that is,

$$\delta(x) = d(x, F) = \inf\{|x - y| : y \in F\}.$$

Clearly, $\delta(x + y) \le |y|$ whenever $x \in F$. Prove the more refined estimate

$$\delta(x + y) = o(|y|) \quad \text{for a.e. } x \in F,$$

that is, $\delta(x + y)/|y| \to 0$ for a.e. $x \in F$.

[Hint: Assume that $x$ is a point of density of $F$.]

10. Construct an increasing function on R whose set of discontinuities is pre- 
cisely Q. 

11. If $a, b > 0$, let

$$f(x) = \begin{cases}
x^a \sin(x^{-b}) & \text{for } 0 < x \le 1, \\
0 & \text{if } x = 0.
\end{cases}$$

Prove that $f$ is of bounded variation in $[0, 1]$ if and only if $a > b$. Then, by
taking $a = b$, construct (for each $0 < \alpha < 1$) a function that satisfies the Lipschitz
condition of exponent $\alpha$

$$|f(x) - f(y)| \le A |x - y|^\alpha$$

but which is not of bounded variation.

[Hint: Note that if $h > 0$, the difference $|f(x + h) - f(x)|$ can be estimated by
$C(x + h)^a$, or $C'h/x$ by the mean value theorem. Then, consider two cases,
whether $x^{a+1} \ge h$ or $x^{a+1} < h$. What is the relationship between $\alpha$ and $a$?]

12. Consider the function $F(x) = x^2 \sin(1/x^2)$, $x \ne 0$, with $F(0) = 0$. Show that
$F'(x)$ exists for every $x$, but $F'$ is not integrable on $[-1, 1]$.

13. Show directly from the definition that the Cantor-Lebesgue function is not 
absolutely continuous. 

14. The following measurability issues arose in the discussion of differentiability 
of functions. 
(a) Suppose $F$ is continuous on $[a, b]$. Show that

$$D^+(F)(x) = \limsup_{h \to 0,\ h > 0} \frac{F(x + h) - F(x)}{h}$$

is measurable.

(b) Suppose $J(x) = \sum_{n=1}^\infty \alpha_n j_n(x)$ is a jump function as in Section 3.3. Show
that

$$\limsup_{h \to 0} \frac{J(x + h) - J(x)}{h}$$

is measurable.

[Hint: For (a), the continuity of $F$ allows one to restrict to countably many $h$ in taking
the limsup. For (b), given $k > m$, let

$$F_{k,m}^N(x) = \sup_{1/k \le |h| \le 1/m} \left| \frac{J_N(x + h) - J_N(x)}{h} \right|,$$

where $J_N(x) = \sum_{n=1}^N \alpha_n j_n(x)$. Note that each $F_{k,m}^N$ is measurable. Then, successively,
let $N \to \infty$, $k \to \infty$, and finally $m \to \infty$.]


15. Suppose $F$ is of bounded variation and continuous. Prove that $F = F_1 - F_2$,
where both $F_1$ and $F_2$ are monotonic and continuous.


16. Show that if $F$ is of bounded variation in $[a, b]$, then:

(a) $\int_a^b |F'(x)|\, dx \le T_F(a, b)$.

(b) $\int_a^b |F'(x)|\, dx = T_F(a, b)$ if and only if $F$ is absolutely continuous.

As a result of (b), the formula $L = \int_a^b |z'(t)|\, dt$ for the length of a rectifiable curve
parametrized by $z$ holds if and only if $z$ is absolutely continuous.

17. Prove that if $\{K_\epsilon\}_{\epsilon > 0}$ is a family of approximations to the identity, then

$$\sup_{\epsilon > 0} |(f * K_\epsilon)(x)| \le c f^*(x)$$

for some constant $c > 0$ and all integrable $f$.

18. Verify the agreement between the two definitions given for the Cantor-Lebesgue 
function in Exercise 2, Chapter 1 and in Section 3.1 of this chapter. 


19. Show that if $f : \mathbb{R} \to \mathbb{R}$ is absolutely continuous, then

(a) $f$ maps sets of measure zero to sets of measure zero.

(b) $f$ maps measurable sets to measurable sets.
20. This exercise deals with functions $F$ that are absolutely continuous on $[a, b]$
and are increasing. Let $A = F(a)$ and $B = F(b)$.

(a) There exists such an $F$ that is in addition strictly increasing, but such that
$F'(x) = 0$ on a set of positive measure.

(b) The $F$ in (a) can be chosen so that there is a measurable subset $E \subset [A, B]$,
$m(E) = 0$, so that $F^{-1}(E)$ is not measurable.

(c) Prove, however, that for any increasing absolutely continuous $F$, and $E$ a
measurable subset of $[A, B]$, the set $F^{-1}(E) \cap \{F'(x) > 0\}$ is measurable.

[Hint: (a) Let $F(x) = \int_a^x \chi_K(x)\, dx$, where $K$ is the complement of a Cantor-like
set $C$ of positive measure. For (b), note that $F(C)$ is a set of measure zero. Finally,
for (c) prove first that $m(O) = \int_{F^{-1}(O)} F'(x)\, dx$ for any open set $O$.]

21. Let $F$ be absolutely continuous and increasing on $[a, b]$ with $F(a) = A$ and
$F(b) = B$. Suppose $f$ is any measurable function on $[A, B]$.

(a) Show that $f(F(x)) F'(x)$ is measurable on $[a, b]$. Note: $f(F(x))$ need not be
measurable by Exercise 20 (b).

(b) Prove the change of variable formula: If $f$ is integrable on $[A, B]$, then so is
$f(F(x)) F'(x)$, and

$$\int_A^B f(y)\, dy = \int_a^b f(F(x)) F'(x)\, dx.$$

[Hint: Start with the identity $m(O) = \int_{F^{-1}(O)} F'(x)\, dx$ used in (c) of Exercise 20
above.]

22. Suppose that $F$ and $G$ are absolutely continuous on $[a, b]$. Show that their
product $FG$ is also absolutely continuous. This has the following consequences.

(a) Whenever $F$ and $G$ are absolutely continuous in $[a, b]$,

$$\int_a^b F'(x) G(x)\, dx = -\int_a^b F(x) G'(x)\, dx + [F(x) G(x)]_a^b.$$

(b) Let $F$ be absolutely continuous in $[-\pi, \pi]$ with $F(\pi) = F(-\pi)$. Show that
if

$$a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(x) e^{-inx}\, dx,$$

such that $F(x) \sim \sum a_n e^{inx}$, then

$$F'(x) \sim \sum i n a_n e^{inx}.$$

(c) What happens if $F(-\pi) \ne F(\pi)$? [Hint: Consider $F(x) = x$.]


23. Let $F$ be continuous on $[a, b]$. Show the following.

(a) Suppose $(D^+ F)(x) \ge 0$ for every $x \in [a, b]$. Then $F$ is increasing on $[a, b]$.

(b) If $F'(x)$ exists for every $x \in (a, b)$ and $|F'(x)| \le M$, then
$|F(x) - F(y)| \le M |x - y|$ and $F$ is absolutely continuous.

[Hint: For (a) it suffices to show that $F(b) - F(a) \ge 0$. Assume otherwise. Hence
with $G_\epsilon(x) = F(x) - F(a) + \epsilon(x - a)$, for sufficiently small $\epsilon > 0$ we have
$G_\epsilon(a) = 0$, but $G_\epsilon(b) < 0$. Now let $x_0 \in [a, b)$ be the greatest value of $x_0$ such that
$G_\epsilon(x_0) \ge 0$. However, $(D^+ G_\epsilon)(x_0) > 0$.]

24. Suppose $F$ is an increasing function on $[a, b]$.

(a) Prove that we can write

$$F = F_A + F_C + F_J,$$

where each of the functions $F_A$, $F_C$, and $F_J$ is increasing and:

(i) $F_A$ is absolutely continuous.

(ii) $F_C$ is continuous, but $F_C'(x) = 0$ for a.e. $x$.

(iii) $F_J$ is a jump function.

(b) Moreover, each component $F_A$, $F_C$, $F_J$ is uniquely determined up to an
additive constant.

The above is the Lebesgue decomposition of $F$. There is a corresponding
decomposition for any $F$ of bounded variation.

25. The following shows the necessity of allowing for general exceptional sets of
measure zero in the differentiation Theorems 1.4, 3.4, and 3.11. Let $E$ be any set
of measure zero in $\mathbb{R}^d$. Show that:

(a) There exists a non-negative integrable $f$ in $\mathbb{R}^d$, so that

$$\liminf_{m(B) \to 0,\ x \in B} \frac{1}{m(B)} \int_B f(y)\, dy = \infty$$

for each $x \in E$.

(b) When $d = 1$ this may be restated as follows. There is an increasing absolutely
continuous function $F$ so that

$$D_+(F)(x) = D_-(F)(x) = \infty, \quad \text{for each } x \in E.$$

[Hint: Find open sets $O_n \supset E$, with $m(O_n) < 2^{-n}$, and let $f(x) = \sum_{n=1}^\infty \chi_{O_n}(x)$.]

26. An alternative way of defining the exterior measure $m_*(E)$ of an arbitrary set
$E$, as given in Section 2 of Chapter 1, is to replace the coverings of $E$ by cubes
with coverings by balls. That is, suppose we define $m_*^{\mathcal{B}}(E)$ as $\inf \sum_{j=1}^\infty m(B_j)$,
where the infimum is taken over all coverings $E \subset \bigcup_{j=1}^\infty B_j$ by open balls. Then
$m_*(E) = m_*^{\mathcal{B}}(E)$. (Observe that this result leads to an alternate proof that the
Lebesgue measure is invariant under rotations.)

Clearly $m_*(E) \le m_*^{\mathcal{B}}(E)$. Prove the reverse inequality by showing the following.
For any $\epsilon > 0$, there is a collection of balls $\{B_j\}$ such that $E \subset \bigcup_j B_j$ while
$\sum_j m(B_j) \le m_*(E) + \epsilon$. Note also that for any preassigned $\delta$, we can choose the
balls to have diameter $< \delta$.

[Hint: Assume first that $E$ is measurable, and pick $O$ open so that $O \supset E$ and
$m(O - E) \le \epsilon'$. Next, using Corollary 3.10, find balls $B_1, \ldots, B_N$ such that
$\sum_{j=1}^N m(B_j) \le m(E) + 2\epsilon'$ and $m(E - \bigcup_{j=1}^N B_j) \le 3\epsilon'$. Finally, cover $E - \bigcup_{j=1}^N B_j$
by a union of cubes, the sum of whose measures is $\le 4\epsilon'$, and replace these cubes
by balls that contain them. For the general $E$, begin by applying the above when
$E$ is a cube.]


27. A rectifiable curve has a tangent line at almost all points of the curve. Make 
this statement precise. 


28. A curve in $\mathbb{R}^d$ is a continuous map $t \mapsto z(t)$ of an interval $[a, b]$ into $\mathbb{R}^d$.

(a) State and prove the analogues of the conditions dealing with the rectifiability
of curves and their length that are given in Theorems 3.1, 4.1, and 4.3.

(b) Define the (one-dimensional) Minkowski content $\mathcal{M}(K)$ of a compact set in
$\mathbb{R}^d$ as the limit (if it exists) of

$$\frac{m(K^\delta)}{m_{d-1}(B(\delta))} \quad \text{as } \delta \to 0,$$

where $m_{d-1}(B(\delta))$ is the measure (in $\mathbb{R}^{d-1}$) of the ball defined by
$B(\delta) = \{x \in \mathbb{R}^{d-1} : |x| < \delta\}$. State and prove analogues of Propositions 4.5 and 4.7
for curves in $\mathbb{R}^d$.


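29. Let $\Gamma = \{z(t),\ a \le t \le b\}$ be a curve, and suppose it satisfies a Lipschitz
condition with exponent $\alpha$, $1/2 \le \alpha \le 1$, that is,

$$|z(t) - z(t')| \le A |t - t'|^\alpha \quad \text{for all } t, t' \in [a, b].$$

Show that $m(\Gamma^\delta) = O(\delta^{2 - 1/\alpha})$ for $0 < \delta \le 1$.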

30. A bounded function $F$ is said to be of bounded variation on $\mathbb{R}$ if $F$ is of
bounded variation on any finite sub-interval $[a, b]$, and $\sup_{a,b} T_F(a, b) < \infty$.

Prove that such an $F$ enjoys the following two properties:

(a) $\int_{\mathbb{R}} |F(x + h) - F(x)|\, dx \le A |h|$, for some constant $A$ and all $h \in \mathbb{R}$.

(b) $\left| \int_{\mathbb{R}} F(x) \varphi'(x)\, dx \right| \le A$, where $\varphi$ ranges over all $C^1$ functions of bounded
support with $\sup_{x \in \mathbb{R}} |\varphi(x)| \le 1$.

For the converse, and analogues in $\mathbb{R}^d$, see Problem 6* below.

[Hint: For (a), write $F = F_1 - F_2$, where $F_j$ are monotonic and bounded. For (b),
deduce this from (a).]

31. Let $F$ be the Cantor-Lebesgue function described in Section 3.1. Consider the
curve that is the graph of $F$, that is, the curve given by $x(t) = t$ and $y(t) = F(t)$
with $0 \le t \le 1$. Prove that the length $L(\bar{x})$ of the segment $0 \le t \le \bar{x}$ of the curve
is given by $L(\bar{x}) = \bar{x} + F(\bar{x})$. Hence the total length of the curve is 2.

32. Let $f : \mathbb{R} \to \mathbb{R}$. Prove that $f$ satisfies the Lipschitz condition

$$|f(x) - f(y)| \le M |x - y|$$

for some $M$ and all $x, y \in \mathbb{R}$, if and only if $f$ satisfies the following two properties:

(i) $f$ is absolutely continuous.

(ii) $|f'(x)| \le M$ for a.e. $x$.

6 Problems 

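1. Prove the following variant of the Vitali covering lemma: If $E$ is covered in
the Vitali sense by a family $\mathcal{B}$ of balls, and $0 < m_*(E) < \infty$, then for every $\eta > 0$
there exists a disjoint collection of balls $\{B_j\}_{j=1}^\infty$ in $\mathcal{B}$ such that

$$m_*\Big(E \setminus \bigcup_{j=1}^\infty B_j\Big) = 0 \quad \text{and} \quad \sum_{j=1}^\infty m_*(B_j) \le (1 + \eta)\, m_*(E).$$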
2. The following simple one-dimensional covering lemma can be used in a number
of different situations.

Suppose $I_1, I_2, \ldots, I_N$ is a given finite collection of open intervals in $\mathbb{R}$. Then
there are two finite sub-collections $I_1', I_2', \ldots, I_K'$, and $I_1'', I_2'', \ldots, I_L''$, so that each
sub-collection consists of mutually disjoint intervals and

$$\bigcup_{j=1}^N I_j = \bigcup_{k=1}^K I_k' \cup \bigcup_{\ell=1}^L I_\ell''.$$

Note that, in contrast with Lemma 1.2, the full union is covered and not merely a
part.

[Hint: Choose $I_1'$ to be an interval whose left end-point is as far left as possible.
Discard all intervals contained in $I_1'$. If the remaining intervals are disjoint from
$I_1'$, select again an interval as far to the left as possible, and call it $I_2'$. Otherwise
choose an interval that intersects $I_1'$ but reaches out to the right as far as possible,
and call this interval $I_1''$. Repeat this procedure.]
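
The selection procedure in the hint can be tried out on small examples. The following Python sketch is an illustration only and is not part of the original problem: the function names `two_disjoint_families` and `merge_open` are hypothetical, open intervals are modeled as pairs `(a, b)` with `a < b`, and the two sub-collections are formed by alternating along the greedy chain, which is a slight variant of the labeling used in the hint.

```python
def merge_open(intervals):
    """Connected components of a union of open intervals (a, b)."""
    merged = []
    for a, b in sorted(intervals):
        # Strict inequality: (0, 1) and (1, 2) do not merge, since 1 is missing.
        if merged and a < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def two_disjoint_families(intervals):
    """Split a greedy sub-cover into two families of disjoint intervals."""
    # Sort by left end-point; among ties, the longest interval comes first.
    pool = sorted(set(intervals), key=lambda I: (I[0], -I[1]))
    chain = []
    while True:
        candidates = []
        if chain:
            right = chain[-1][1]
            # Intervals that meet the last chosen one and reach beyond it.
            candidates = [I for I in pool if I[0] < right < I[1]]
        if candidates:
            chosen = max(candidates, key=lambda I: I[1])
        else:
            # Nothing reaches further: every interval meeting the chain is
            # already covered, so restart with the leftmost interval beyond it.
            rest = [I for I in pool if not chain or I[0] >= chain[-1][1]]
            if not rest:
                break
            chosen = rest[0]
        chain.append(chosen)
    # Intervals two steps apart in the chain cannot overlap.
    return chain[0::2], chain[1::2]


intervals = [(0, 3), (1, 2), (2, 5), (4, 7), (6, 8), (10, 12)]
first, second = two_disjoint_families(intervals)
print(first, second)
```

For the sample intervals above, the sketch selects (0,3), (4,7), (10,12) as one family and (2,5), (6,8) as the other; the interval (1,2) is discarded because it is contained in (0,3). Comparing `merge_open` of the selected intervals with `merge_open` of the input is one way to check that the full union is still covered.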

3.* There is no direct analogue of Problem 2 in higher dimensions. However, a full
covering is afforded by the Besicovitch covering lemma. A version of this lemma
states that there is an integer $N$ (dependent only on the dimension $d$) with the
following property. Suppose $E$ is any bounded set in $\mathbb{R}^d$ that is covered by a
collection $\mathcal{B}$ of balls in the (strong) sense that for each $x \in E$, there is a $B \in \mathcal{B}$
whose center is $x$. Then, there are $N$ sub-collections $\mathcal{B}_1, \mathcal{B}_2, \ldots, \mathcal{B}_N$ of the original
collection $\mathcal{B}$, such that each $\mathcal{B}_j$ is a collection of disjoint balls, and moreover,

$$E \subset \bigcup_{B \in \mathcal{B}'} B, \quad \text{where } \mathcal{B}' = \mathcal{B}_1 \cup \mathcal{B}_2 \cup \cdots \cup \mathcal{B}_N.$$


4. A real-valued function $\varphi$ defined on an interval $(a, b)$ is convex if the region
lying above its graph $\{(x, y) \in \mathbb{R}^2 : y > \varphi(x),\ a < x < b\}$ is a convex set, as defined
in Section 5*, Chapter 1. Equivalently, $\varphi$ is convex if

$$\varphi(\theta x_1 + (1 - \theta) x_2) \le \theta \varphi(x_1) + (1 - \theta) \varphi(x_2)$$

for every $x_1, x_2 \in (a, b)$ and $0 \le \theta \le 1$. One can also observe as a consequence that
we have the following inequality of the slopes:

$$\frac{\varphi(x + h) - \varphi(x)}{h} \le \frac{\varphi(y) - \varphi(x)}{y - x} \le \frac{\varphi(y) - \varphi(y - h)}{h},$$

whenever $x < y$, $h > 0$, and $x + h < y$.

The following can then be proved.

(a) $\varphi$ is continuous on $(a, b)$.

(b) $\varphi$ satisfies a Lipschitz condition of order 1 in any proper closed sub-interval
$[a', b']$ of $(a, b)$. Hence $\varphi$ is absolutely continuous in each sub-interval.

(c) $\varphi'$ exists at all but an at most denumerable number of points, and $\varphi' = D^+ \varphi$
is an increasing function with

$$\varphi(y) - \varphi(x) = \int_x^y \varphi'(t)\, dt.$$

(d) Conversely, if $\psi$ is any increasing function on $(a, b)$, then $\varphi(x) = \int_c^x \psi(t)\, dt$
is a convex function in $(a, b)$ (for $c \in (a, b)$).


5. Suppose that $F$ is continuous on $[a, b]$, $F'(x)$ exists for every $x \in (a, b)$, and
$F'(x)$ is integrable. Then $F$ is absolutely continuous and

$$F(b) - F(a) = \int_a^b F'(x)\, dx.$$

Figure 13. A convex function

[Hint: Assume $F'(x) \ge 0$ for a.e. $x$. We want to conclude that $F(b) \ge F(a)$. Let
$E$ be the set of measure 0 of those $x$ such that $F'(x) < 0$. Then according to
Exercise 25, there is a function $\Phi$ which is increasing, absolutely continuous, and for
which $D^+ \Phi(x) = \infty$, $x \in E$. Consider $F + \delta \Phi$, for each $\delta$ and apply the result (a)
in Exercise 23.]

6.* The following converse to Exercise 30 characterizes functions of bounded
variation.

Suppose $F$ is a bounded measurable function on $\mathbb{R}$. If $F$ satisfies either of
conditions (a) or (b) in that exercise, then $F$ can be modified on a set of measure
zero so as to become a function of bounded variation on $\mathbb{R}$.

Moreover, on $\mathbb{R}^d$ we have the following assertion. Suppose $F$ is a bounded
measurable function on $\mathbb{R}^d$. Then the following two conditions on $F$ are equivalent:

(a') $\int_{\mathbb{R}^d} |F(x + h) - F(x)|\, dx \le A |h|$, for all $h \in \mathbb{R}^d$.

(b') $\left| \int_{\mathbb{R}^d} F(x) \frac{\partial \varphi}{\partial x_j}\, dx \right| \le A$, for all $j = 1, \ldots, d$,

for all $\varphi \in C^1$ that have bounded support, and for which $\sup_{x \in \mathbb{R}^d} |\varphi(x)| \le 1$.
The class of functions that satisfy either (a') or (b') is the extension to $\mathbb{R}^d$ of
the class of functions of bounded variation.

7. Consider the function

$$f_1(x) = \sum_{n=0}^\infty 2^{-n} e^{2\pi i 2^n x}.$$

(a) Prove that $f_1$ satisfies $|f_1(x) - f_1(y)| \le A_\alpha |x - y|^\alpha$ for each $0 < \alpha < 1$.

(b)* However, $f_1$ is nowhere differentiable, hence not of bounded variation.
8.* Let $\mathcal{R}$ denote the set of all rectangles in $\mathbb{R}^2$ that contain the origin, and with
sides parallel to the coordinate axes. Consider the maximal operator associated to
this family, namely

$$f_{\mathcal{R}}^*(x) = \sup_{R \in \mathcal{R}} \frac{1}{m(R)} \int_R |f(x - y)|\, dy.$$

(a) Then, $f \mapsto f_{\mathcal{R}}^*$ does not satisfy the weak type inequality

$$m(\{x : f_{\mathcal{R}}^*(x) > \alpha\}) \le \frac{A}{\alpha} \|f\|_{L^1}$$

for all $\alpha > 0$, all integrable $f$, and some $A > 0$.

(b) Using this, one can show that there exists $f \in L^1(\mathbb{R}^2)$ so that for $R \in \mathcal{R}$

$$\limsup_{\operatorname{diam}(R) \to 0} \frac{1}{m(R)} \int_R f(x - y)\, dy = \infty \quad \text{for almost every } x.$$

Here $\operatorname{diam}(R) = \sup_{x, y \in R} |x - y|$ equals the diameter of the rectangle.

[Hint: For part (a), let $B$ be the unit ball, and consider the function
$\varphi(x) = \chi_B(x)/m(B)$. For $\delta > 0$, let $\varphi_\delta(x) = \delta^{-2} \varphi(x/\delta)$. Then

$$(\varphi_\delta)_{\mathcal{R}}^*(x) \to \frac{1}{|x_1| |x_2|} \quad \text{as } \delta \to 0,$$

for every $(x_1, x_2)$, with $x_1 x_2 \ne 0$. If the weak type inequality held, then we would
have

$$m(\{|x| \le 1 : |x_1 x_2|^{-1} > \alpha\}) \le \frac{A}{\alpha}.$$

This is a contradiction since the left-hand side is of the order of $(\log \alpha)/\alpha$ as $\alpha$
tends to infinity.]


4 Hilbert Spaces: An Introduction


Born barely 10 years ago, the theory of integral equa-
tions has attracted wide attention as much as for its
inherent interest as for the importance of its applica- 
tions. Several of its results are already classic, and no 
one doubts that in a few years every course in analysis 
will devote a chapter to it. 

M. Plancherel, 1912 


There are two reasons that account for the importance of Hilbert 
spaces. First, they arise as the natural infinite-dimensional generaliza- 
tions of Euclidean spaces, and as such, they enjoy the familiar properties 
of orthogonality, complemented by the important feature of complete- 
ness. Second, the theory of Hilbert spaces serves both as a conceptual 
framework and as a language that formulates some basic arguments in 
analysis in a more abstract setting. 

For us the immediate link with integration theory occurs because of
the example of the Lebesgue space $L^2(\mathbb{R}^d)$. The related example of
$L^2([-\pi, \pi])$ is what connects Hilbert spaces with Fourier series. The
latter Hilbert space can also be used in an elegant way to analyze the
boundary behavior of bounded holomorphic functions in the unit disc.

A basic aspect of the theory of Hilbert spaces, as in the familiar finite- 
dimensional case, is the study of their linear transformations. Given the 
introductory nature of this chapter, we limit ourselves to rather brief 
discussions of several classes of such operators: unitary mappings, pro- 
jections, linear functionals, and compact operators. 

1 The Hilbert space $L^2$

A prime example of a Hilbert space is the collection of square integrable
functions on $\mathbb{R}^d$, which is denoted by $L^2(\mathbb{R}^d)$, and consists of
all complex-valued measurable functions $f$ that satisfy

$$\int_{\mathbb{R}^d} |f(x)|^2\, dx < \infty.$$
The resulting $L^2(\mathbb{R}^d)$-norm of $f$ is defined by

$$\|f\|_{L^2(\mathbb{R}^d)} = \left( \int_{\mathbb{R}^d} |f(x)|^2\, dx \right)^{1/2}.$$

The reader should compare these definitions with those for the space
$L^1(\mathbb{R}^d)$ of integrable functions and its norm that were described in
Section 2, Chapter 2. A crucial difference is that $L^2$ has an inner product,
which $L^1$ does not. Some relative inclusion relations between those spaces
are taken up in Exercise 5.

The space $L^2(\mathbb{R}^d)$ is naturally equipped with the following inner product:

$$(f, g) = \int_{\mathbb{R}^d} f(x) \overline{g(x)}\, dx, \quad \text{whenever } f, g \in L^2(\mathbb{R}^d),$$

which is intimately related to the $L^2$-norm since

$$(f, f)^{1/2} = \|f\|_{L^2(\mathbb{R}^d)}.$$

As in the case of integrable functions, the condition $\|f\|_{L^2(\mathbb{R}^d)} = 0$ only
implies $f(x) = 0$ almost everywhere. Therefore, we in fact identify functions
that are equal almost everywhere, and define $L^2(\mathbb{R}^d)$ as the space
of equivalence classes under this identification. However, in practice it is
often convenient to think of elements in $L^2(\mathbb{R}^d)$ as functions, and not as
equivalence classes of functions.

For the definition of the inner product $(f, g)$ to be meaningful we need
to know that $f \overline{g}$ is integrable on $\mathbb{R}^d$ whenever $f$ and $g$ belong to $L^2(\mathbb{R}^d)$.
This and other basic properties of the space of square integrable functions
are gathered in the next proposition.

In the rest of this chapter we shall denote the $L^2$-norm by $\| \cdot \|$ (dropping
the subscript $L^2(\mathbb{R}^d)$) unless stated otherwise.

Proposition 1.1 The space $L^2(\mathbb{R}^d)$ has the following properties:

(i) $L^2(\mathbb{R}^d)$ is a vector space.

(ii) $f(x) \overline{g(x)}$ is integrable whenever $f, g \in L^2(\mathbb{R}^d)$, and the Cauchy-Schwarz
inequality holds: $|(f, g)| \le \|f\| \|g\|$.

(iii) If $g \in L^2(\mathbb{R}^d)$ is fixed, the map $f \mapsto (f, g)$ is linear in $f$, and also
$(f, g) = \overline{(g, f)}$.

(iv) The triangle inequality holds: $\|f + g\| \le \|f\| + \|g\|$.
Proof. If $f, g \in L^2(\mathbb{R}^d)$, then since $|f(x) + g(x)| \le 2 \max(|f(x)|, |g(x)|)$,
we have

$$|f(x) + g(x)|^2 \le 4(|f(x)|^2 + |g(x)|^2),$$

therefore

$$\int |f + g|^2 \le 4 \int |f|^2 + 4 \int |g|^2 < \infty,$$

hence $f + g \in L^2(\mathbb{R}^d)$. Also, if $\lambda \in \mathbb{C}$ we clearly have $\lambda f \in L^2(\mathbb{R}^d)$, and
part (i) is proved.

To see why $f \overline{g}$ is integrable whenever $f$ and $g$ are in $L^2(\mathbb{R}^d)$, it suffices
to recall that for all $A, B \ge 0$, one has $2AB \le A^2 + B^2$, so that

$$(1) \qquad \int |f \overline{g}| \le \frac{1}{2} \left[ \|f\|^2 + \|g\|^2 \right].$$

To prove the Cauchy-Schwarz inequality, we first observe that if either
$\|f\| = 0$ or $\|g\| = 0$, then $f \overline{g}$ is zero almost everywhere, hence
$(f, g) = 0$ and the inequality is obvious. Next, if we assume that $\|f\| = \|g\| = 1$,
then we get the desired inequality $|(f, g)| \le 1$. This follows from the fact
that $|(f, g)| \le \int |f \overline{g}|$, and inequality (1). Finally, in the case when both
$\|f\|$ and $\|g\|$ are non-zero, we normalize $f$ and $g$ by setting

$$\tilde{f} = f/\|f\| \quad \text{and} \quad \tilde{g} = g/\|g\|,$$

so that $\|\tilde{f}\| = \|\tilde{g}\| = 1$. By our previous observation we then find

$$|(\tilde{f}, \tilde{g})| \le 1.$$

Multiplying both sides of the above by $\|f\| \|g\|$ yields the Cauchy-Schwarz
inequality.

Part (iii) follows from the linearity of the integral.

Finally, to prove the triangle inequality, we use the Cauchy-Schwarz
inequality as follows:

$$\begin{aligned}
\|f + g\|^2 &= (f + g, f + g) \\
&= \|f\|^2 + (f, g) + (g, f) + \|g\|^2 \\
&\le \|f\|^2 + 2 |(f, g)| + \|g\|^2 \\
&\le \|f\|^2 + 2 \|f\| \|g\| + \|g\|^2 \\
&= (\|f\| + \|g\|)^2,
\end{aligned}$$

and taking square roots completes the argument.
We turn our attention to the notion of a limit in the space $L^2(\mathbb{R}^d)$.
The norm on $L^2$ induces a metric $d$ as follows: if $f, g \in L^2(\mathbb{R}^d)$, then

$$d(f, g) = \|f - g\|_{L^2(\mathbb{R}^d)}.$$

A sequence $\{f_n\} \subset L^2(\mathbb{R}^d)$ is said to be Cauchy if $d(f_n, f_m) \to 0$ as
$n, m \to \infty$. Moreover, this sequence converges to $f \in L^2(\mathbb{R}^d)$ if $d(f_n, f) \to 0$
as $n \to \infty$.

Theorem 1.2 The space $L^2(\mathbb{R}^d)$ is complete in its metric.

In other words, every Cauchy sequence in $L^2(\mathbb{R}^d)$ converges to a function
in $L^2(\mathbb{R}^d)$. This theorem, which is in sharp contrast with the situation for
Riemann integrable functions, is a graphic illustration of the usefulness
of Lebesgue's theory of integration. We elaborate on this point and its
relation to Fourier series in Section 3 below.

Proof. The argument given here follows closely the proof in Chapter 2
that $L^1$ is complete. Let $\{f_n\}_{n=1}^\infty$ be a Cauchy sequence in $L^2$, and
consider a subsequence $\{f_{n_k}\}_{k=1}^\infty$ of $\{f_n\}$ with the following property:

$$\|f_{n_{k+1}} - f_{n_k}\| \le 2^{-k}, \quad \text{for all } k \ge 1.$$

If we now consider the series whose convergence will be seen below,

$$f(x) = f_{n_1}(x) + \sum_{k=1}^\infty (f_{n_{k+1}}(x) - f_{n_k}(x))$$

and

$$g(x) = |f_{n_1}(x)| + \sum_{k=1}^\infty |f_{n_{k+1}}(x) - f_{n_k}(x)|,$$

together with the partial sums

$$S_K(f)(x) = f_{n_1}(x) + \sum_{k=1}^K (f_{n_{k+1}}(x) - f_{n_k}(x))$$

and

$$S_K(g)(x) = |f_{n_1}(x)| + \sum_{k=1}^K |f_{n_{k+1}}(x) - f_{n_k}(x)|,$$

then the triangle inequality implies

$$\|S_K(g)\| \le \|f_{n_1}\| + \sum_{k=1}^K \|f_{n_{k+1}} - f_{n_k}\| \le \|f_{n_1}\| + \sum_{k=1}^K 2^{-k}.$$

Letting $K$ tend to infinity, and applying the monotone convergence theorem
proves that $\int |g|^2 < \infty$, and since $|f| \le g$, we must have $f \in L^2(\mathbb{R}^d)$.

In particular, the series defining $f$ converges almost everywhere, and
since (by construction of the telescopic series) the $(K - 1)^{\text{th}}$ partial sum
of this series is precisely $f_{n_K}$, we find that

$$f_{n_k}(x) \to f(x) \quad \text{a.e. } x.$$

To prove that $f_{n_k} \to f$ in $L^2(\mathbb{R}^d)$ as well, we simply observe that
$|f - S_K(f)|^2 \le (2g)^2$ for all $K$, and apply the dominated convergence theorem
to get $\|f_{n_k} - f\| \to 0$ as $k$ tends to infinity.

Finally, the last step of the proof consists of recalling that $\{f_n\}$ is
Cauchy. Given $\epsilon$, there exists $N$ such that for all $n, m > N$ we have
$\|f_n - f_m\| < \epsilon/2$. If $n_k$ is chosen so that $n_k > N$, and $\|f_{n_k} - f\| < \epsilon/2$,
then the triangle inequality implies

$$\|f_n - f\| \le \|f_n - f_{n_k}\| + \|f_{n_k} - f\| < \epsilon$$

whenever $n > N$. This concludes the proof of the theorem.

An additional useful property of $L^2(\mathbb{R}^d)$ is contained in the following
theorem.

Theorem 1.3 The space $L^2(\mathbb{R}^d)$ is separable, in the sense that there
exists a countable collection $\{f_k\}$ of elements in $L^2(\mathbb{R}^d)$ such that their
linear combinations are dense in $L^2(\mathbb{R}^d)$.

Proof. Consider the family of functions of the form $r \chi_R(x)$, where $r$
is a complex number with rational real and imaginary parts, and $R$ is
a rectangle in $\mathbb{R}^d$ with rational coordinates. We claim that finite linear
combinations of these types of functions are dense in $L^2(\mathbb{R}^d)$.

Suppose $f \in L^2(\mathbb{R}^d)$ and let $\epsilon > 0$. Consider for each $n \ge 1$ the function
$g_n$ defined by

$$g_n(x) = \begin{cases} f(x) & \text{if } |x| \le n \text{ and } |f(x)| \le n, \\ 0 & \text{otherwise.} \end{cases}$$

Then $|f - g_n|^2 \le 4 |f|^2$ and $g_n(x) \to f(x)$ almost everywhere.¹ The dominated
convergence theorem implies that $\|f - g_n\|_{L^2(\mathbb{R}^d)}^2 \to 0$ as $n$ tends
to infinity; therefore we have

$$\|f - g_N\|_{L^2(\mathbb{R}^d)} < \epsilon/2 \quad \text{for some } N.$$

Let $g = g_N$, and note that $g$ is a bounded function supported on a
bounded set; thus $g \in L^1(\mathbb{R}^d)$. We may now find a step function $\varphi$ so
that $|\varphi| \le N$ and $\int |g - \varphi| < \epsilon^2/16N$ (Theorem 2.4, Chapter 2). By replacing
the coefficients and rectangles that appear in the canonical form
of $\varphi$ by complex numbers with rational real and imaginary parts, and
rectangles with rational coordinates, we may find a $\psi$ with $|\psi| \le N$ and
$\int |g - \psi| < \epsilon^2/8N$. Finally, we note that

$$\int |g - \psi|^2 \le 2N \int |g - \psi| < \epsilon^2/4.$$

Consequently $\|g - \psi\| < \epsilon/2$, therefore $\|f - \psi\| < \epsilon$, and the proof is
complete.

The example $L^2(\mathbb{R}^d)$ possesses all the characteristic properties of a
Hilbert space, and motivates the definition of the abstract version of this
concept.

2 Hilbert spaces 

A set $\mathcal{H}$ is a Hilbert space if it satisfies the following:

(i) $\mathcal{H}$ is a vector space over $\mathbb{C}$ (or $\mathbb{R}$).²

(ii) $\mathcal{H}$ is equipped with an inner product $(\cdot, \cdot)$, so that

• $f \mapsto (f, g)$ is linear on $\mathcal{H}$ for every fixed $g \in \mathcal{H}$,

• $(f, g) = \overline{(g, f)}$,

• $(f, f) \ge 0$ for all $f \in \mathcal{H}$.

We let $\|f\| = (f, f)^{1/2}$.

(iii) $\|f\| = 0$ if and only if $f = 0$.


¹ By definition $f \in L^2(\mathbb{R}^d)$ implies that $|f|^2$ is integrable, hence $f(x)$ is finite for a.e. $x$.

² At this stage we consider both cases, where the scalar field can be either $\mathbb{C}$ or $\mathbb{R}$.
However, in many applications, such as in the context of Fourier analysis, one deals
primarily with Hilbert spaces over $\mathbb{C}$.
(iv) The Cauchy-Schwarz and triangle inequalities hold

$$|(f, g)| \le \|f\| \|g\| \quad \text{and} \quad \|f + g\| \le \|f\| + \|g\|$$

for all $f, g \in \mathcal{H}$.

(v) $\mathcal{H}$ is complete in the metric $d(f, g) = \|f - g\|$.

(vi) $\mathcal{H}$ is separable.

We make two comments about the definition of a Hilbert space. First,
the Cauchy-Schwarz and triangle inequalities in (iv) are in fact easy
consequences of assumptions (i) and (ii). (See Exercise 1.) Second, we
make the requirement that $\mathcal{H}$ be separable because that is the case in
most applications encountered. That is not to say that there are no
interesting non-separable examples; one such example is described in
Problem 2.

Also, we remark that in the context of a Hilbert space we shall often
write $\lim_{n \to \infty} f_n = f$ or $f_n \to f$ to mean that $\lim_{n \to \infty} \|f_n - f\| = 0$,
which is the same as $d(f_n, f) \to 0$.

We give some examples of Hilbert spaces. 

Example 1. If $E$ is a measurable subset of $\mathbb{R}^d$ with $m(E) > 0$, we let
$L^2(E)$ denote the space of square integrable functions that are supported
on $E$,

$$L^2(E) = \left\{ f \text{ supported on } E, \text{ so that } \int_E |f(x)|^2\, dx < \infty \right\}.$$

The inner product and norm on $L^2(E)$ are then

$$(f, g) = \int_E f(x) \overline{g(x)}\, dx \quad \text{and} \quad \|f\| = \left( \int_E |f(x)|^2\, dx \right)^{1/2}.$$

Once again, we consider two elements of $L^2(E)$ to be equivalent if they
differ only on a set of measure zero; this guarantees that $\|f\| = 0$ implies
$f = 0$. The properties (i) through (vi) follow from those of $L^2(\mathbb{R}^d)$ proved
above.

Example 2. A simple example is the finite-dimensional complex Euclidean
space. Indeed,

$$\mathbb{C}^N = \{(a_1, \ldots, a_N) : a_k \in \mathbb{C}\}$$

becomes a Hilbert space when equipped with the inner product

$$\sum_{k=1}^N a_k \overline{b_k},$$

where $a = (a_1, \ldots, a_N)$ and $b = (b_1, \ldots, b_N)$ are in $\mathbb{C}^N$. The norm is then

$$\|a\| = \left( \sum_{k=1}^N |a_k|^2 \right)^{1/2}.$$

One can formulate in the same way the real Hilbert space $\mathbb{R}^N$.
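
As a concrete check of Example 2, the following Python sketch computes the inner product and norm on the finite-dimensional complex Euclidean space and tests conjugate symmetry, the Cauchy-Schwarz inequality, and the triangle inequality on one pair of vectors. It is an illustration under stated assumptions and not part of the text: vectors are represented as Python lists of complex numbers, and the helper names `inner` and `norm` are ours.

```python
import math


def inner(a, b):
    """Inner product on C^N: the sum of a_k times the conjugate of b_k."""
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))


def norm(a):
    """Norm induced by the inner product, (a, a)^(1/2)."""
    return math.sqrt(inner(a, a).real)


a = [1 + 2j, 3 - 1j, 0.5j]
b = [2 - 1j, 1j, 4 + 0j]

# Conjugate symmetry: (a, b) is the complex conjugate of (b, a).
assert abs(inner(a, b) - inner(b, a).conjugate()) < 1e-12

# Cauchy-Schwarz inequality.
assert abs(inner(a, b)) <= norm(a) * norm(b)

# Triangle inequality for the sum a + b.
total = [x + y for x, y in zip(a, b)]
assert norm(total) <= norm(a) + norm(b)
```

A single numerical example does not prove anything, of course; it only makes the roles of the conjugate in the second slot and of the square root in the norm tangible.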


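Example 3. An infinite-dimensional analogue of the above example is
the space $\ell^2(\mathbb{Z})$. By definition

$$\ell^2(\mathbb{Z}) = \left\{ (\ldots, a_{-2}, a_{-1}, a_0, a_1, \ldots) : a_j \in \mathbb{C},\ \sum_{n=-\infty}^\infty |a_n|^2 < \infty \right\}.$$

If we denote infinite sequences by $a$ and $b$, the inner product and norm
on $\ell^2(\mathbb{Z})$ are

$$(a, b) = \sum_{k=-\infty}^\infty a_k \overline{b_k} \quad \text{and} \quad \|a\| = \left( \sum_{k=-\infty}^\infty |a_k|^2 \right)^{1/2}.$$

We leave the proof that $\ell^2(\mathbb{Z})$ is a Hilbert space as Exercise 4.

While this example is very simple, it will turn out that all infinite-dimensional
(separable) Hilbert spaces are $\ell^2(\mathbb{Z})$ in disguise.

Also, a slight variant of this space is $\ell^2(\mathbb{N})$, where we take only one-sided
sequences, that is,

$$\ell^2(\mathbb{N}) = \left\{ (a_1, a_2, \ldots) : a_j \in \mathbb{C},\ \sum_{n=1}^\infty |a_n|^2 < \infty \right\}.$$

The inner product and norm are then defined in the same way with the
sums extending from $n = 1$ to $\infty$.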
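
To get a feeling for the summability condition defining the space of one-sided square summable sequences, one can compare partial sums numerically. The sketch below is only an illustration: the two sample sequences and the helper name `partial_norm_sq` are our own choices. The sequence with terms 1/n is square summable (the sum of 1/n^2 equals pi^2/6), while the sequence with terms 1/sqrt(n) is not, since the sum of its squares is the harmonic series.

```python
import math


def partial_norm_sq(term, N):
    """Sum of |term(n)|^2 for n = 1, ..., N."""
    return sum(abs(term(n)) ** 2 for n in range(1, N + 1))


def a(n):
    return 1.0 / n             # square summable


def b(n):
    return 1.0 / math.sqrt(n)  # not square summable


for N in (10, 1000, 100000):
    # The first column of sums stabilizes near pi^2/6; the second keeps growing.
    print(N, partial_norm_sq(a, N), partial_norm_sq(b, N))
```

The partial sums for the first sequence stay bounded and approach pi^2/6, whereas those for the second grow without bound (roughly like log N), so only the first sequence defines an element of the space.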

A characteristic feature of a Hilbert space is the notion of orthogo- 
nality. This aspect, with its rich geometric and analytic consequences, 
distinguishes Hilbert spaces from other normed vector spaces. We now 
describe some of these properties. 
2.1 Orthogonality

Two elements $f$ and $g$ in a Hilbert space $\mathcal{H}$ with inner product $(\cdot, \cdot)$ are
orthogonal or perpendicular if

$$(f, g) = 0, \quad \text{and we then write } f \perp g.$$

The first simple observation is that the usual theorem of Pythagoras
holds in the setting of abstract Hilbert spaces:

Proposition 2.1 If $f \perp g$, then $\|f + g\|^2 = \|f\|^2 + \|g\|^2$.

Proof. It suffices to note that $(f, g) = 0$ implies $(g, f) = 0$, and therefore

$$\|f + g\|^2 = (f + g, f + g) = \|f\|^2 + (f, g) + (g, f) + \|g\|^2 = \|f\|^2 + \|g\|^2.$$

A finite or countably infinite subset $\{e_1, e_2, \ldots\}$ of a Hilbert space $\mathcal{H}$
is orthonormal if

$$(e_k, e_\ell) = \begin{cases} 1 & \text{when } k = \ell, \\ 0 & \text{when } k \ne \ell. \end{cases}$$

In other words, each $e_k$ has unit norm and is orthogonal to $e_\ell$ whenever
$\ell \ne k$.

Proposition 2.2 If $\{e_k\}_{k=1}^\infty$ is orthonormal, and $f = \sum a_k e_k \in \mathcal{H}$ where
the sum is finite, then

$$\|f\|^2 = \sum |a_k|^2.$$

The proof is a simple application of the Pythagorean theorem.

Given an orthonormal subset $\{e_1, e_2, \ldots\} = \{e_k\}_{k=1}^\infty$ of $\mathcal{H}$, a natural
problem is to determine whether this subset spans all of $\mathcal{H}$, that is,
whether finite linear combinations of elements in $\{e_1, e_2, \ldots\}$ are dense
in $\mathcal{H}$. If this is the case, we say that $\{e_k\}_{k=1}^\infty$ is an orthonormal basis
for $\mathcal{H}$. If we are in the presence of an orthonormal basis, we might expect
that any $f \in \mathcal{H}$ takes the form

$$f = \sum_{k=1}^\infty a_k e_k,$$

for some constants $a_k \in \mathbb{C}$. In fact, taking the inner product of both
sides with $e_j$, and recalling that $\{e_k\}$ is orthonormal yields (formally)

$$(f, e_j) = a_j.$$

This question is motivated by Fourier series. In fact, a good insight
into the theorem below is afforded by considering the case where $\mathcal{H}$
is $L^2([-\pi, \pi])$ with inner product $(f, g) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{g(x)}\, dx$, and the
orthonormal set $\{e_k\}_{k=1}^\infty$ is merely a relabeling of the exponentials
$\{e^{inx}\}_{n=-\infty}^\infty$.

Adapting the notation used in Fourier series, we write $f \sim \sum_{k=1}^\infty a_k e_k$,
where $a_j = (f, e_j)$ for all $j$.

In the next theorem, we provide four equivalent characterizations that
$\{e_k\}$ is an orthonormal basis for $\mathcal{H}$.

Theorem 2.3 The following properties of an orthonormal set $\{e_k\}_{k=1}^\infty$
are equivalent.

(i) Finite linear combinations of elements in $\{e_k\}$ are dense in $\mathcal{H}$.

(ii) If $f \in \mathcal{H}$ and $(f, e_j) = 0$ for all $j$, then $f = 0$.

(iii) If $f \in \mathcal{H}$, and $S_N(f) = \sum_{k=1}^N a_k e_k$, where $a_k = (f, e_k)$, then
$S_N(f) \to f$ as $N \to \infty$ in the norm.

(iv) If $a_k = (f, e_k)$, then $\|f\|^2 = \sum_{k=1}^\infty |a_k|^2$.

Proof. We prove that each property implies the next, with the last
one implying the first.

We begin by assuming (i). Given $f \in \mathcal{H}$ with $(f, e_j) = 0$ for all $j$, we
wish to prove that $f = 0$. By assumption, there exists a sequence $\{g_n\}$
of elements in $\mathcal{H}$ that are finite linear combinations of elements in $\{e_k\}$,
and such that $\|f - g_n\|$ tends to 0 as $n$ goes to infinity. Since $(f, e_j) = 0$
for all $j$, we must have $(f, g_n) = 0$ for all $n$; therefore an application of
the Cauchy-Schwarz inequality gives

$$\|f\|^2 = (f, f) = (f, f - g_n) \le \|f\| \|f - g_n\| \quad \text{for all } n.$$

Letting $n \to \infty$ proves that $\|f\|^2 = 0$; hence $f = 0$, and (i) implies (ii).
Now suppose that (ii) is verified. For $f \in \mathcal{H}$ we define

$$S_N(f) = \sum_{k=1}^N a_k e_k, \quad \text{where } a_k = (f, e_k),$$
and prove first that $S_N(f)$ converges to some element $g \in \mathcal{H}$. Indeed,
one notices that the definition of $a_k$ implies $(f - S_N(f)) \perp S_N(f)$, so
the Pythagorean theorem and Proposition 2.2 give

$$(2) \qquad \|f\|^2 = \|f - S_N(f)\|^2 + \|S_N(f)\|^2 = \|f - S_N(f)\|^2 + \sum_{k=1}^N |a_k|^2.$$

Hence $\|f\|^2 \ge \sum_{k=1}^N |a_k|^2$, and letting $N$ tend to infinity we obtain Bessel's
inequality

$$\sum_{k=1}^\infty |a_k|^2 \le \|f\|^2,$$

which implies that the series $\sum_{k=1}^\infty |a_k|^2$ converges. Therefore, $\{S_N(f)\}_{N=1}^\infty$
forms a Cauchy sequence in $\mathcal{H}$ since

$$\|S_N(f) - S_M(f)\|^2 = \sum_{k=M+1}^N |a_k|^2 \quad \text{whenever } N > M.$$

Since $\mathcal{H}$ is complete, there exists $g \in \mathcal{H}$ such that $S_N(f) \to g$ as $N$ tends
to infinity.

Fix $j$, and note that for all sufficiently large $N$,
$(f - S_N(f), e_j) = a_j - a_j = 0$. Since $S_N(f)$ tends to $g$, we conclude that

$$(f - g, e_j) = 0 \quad \text{for all } j.$$

Hence $f = g$ by assumption (ii), and we have proved that $f = \sum_{k=1}^\infty a_k e_k$.

Now assume that (iii) holds. Observe from (2) that we immediately
get in the limit as $N$ goes to infinity

$$\|f\|^2 = \sum_{k=1}^\infty |a_k|^2.$$

Finally, if (iv) holds, then again from (2) we see that $\|f - S_N(f)\|$
converges to 0. Since each $S_N(f)$ is a finite linear combination of elements
in $\{e_k\}$, we have completed the circle of implications, and the theorem
is proved.

In particular, a closer look at the proof shows that Bessel's inequality
holds for any orthonormal family $\{e_k\}$. In contrast, the identity

$$\|f\|^2 = \sum_{k=1}^\infty |a_k|^2, \quad \text{where } a_k = (f, e_k),$$

which is called Parseval's identity, holds if and only if $\{e_k\}_{k=1}^\infty$ is also
an orthonormal basis.
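
The contrast between Bessel's inequality and Parseval's identity can already be seen in three complex dimensions. The following Python sketch is a finite-dimensional illustration only; the particular vectors and the helper names `inner`, `norm_sq`, and `coeff_energy` are our own assumptions. It uses an orthonormal family of two vectors, which is not a basis, and then completes it to an orthonormal basis with a third vector.

```python
import math


def inner(a, b):
    """Inner product on C^N, linear in the first argument."""
    return sum(x * complex(y).conjugate() for x, y in zip(a, b))


def norm_sq(a):
    """Squared norm (a, a)."""
    return inner(a, a).real


def coeff_energy(f, family):
    """Sum of |a_k|^2 with a_k = (f, e_k), over the given family."""
    return sum(abs(inner(f, e)) ** 2 for e in family)


s = 1 / math.sqrt(2)
e1, e2, e3 = [s, s, 0], [s, -s, 0], [0, 0, 1]   # an orthonormal basis of C^3
f = [2 + 1j, -1, 3j]

# Bessel: with only e1, e2 the coefficient sum falls short of ||f||^2.
assert coeff_energy(f, [e1, e2]) < norm_sq(f)

# Parseval: with the full orthonormal basis the two sides agree.
assert abs(coeff_energy(f, [e1, e2, e3]) - norm_sq(f)) < 1e-12
```

Here the inequality is strict because the chosen vector has a nonzero component along the third basis vector, which the two-element family cannot detect.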

Now we turn our attention to the existence of a basis. 

Theorem 2.4 Any Hilbert space has an orthonormal basis. 

The first step in the proof of this fact is to recall that (by definition)
a Hilbert space $\mathcal{H}$ is separable. Hence, we may choose a countable
collection of elements $\mathcal{F} = \{h_k\}$ in $\mathcal{H}$ so that finite linear combinations of
elements in $\mathcal{F}$ are dense in $\mathcal{H}$.

We start by recalling a definition already used in the case of
finite-dimensional vector spaces. Finitely many elements $g_1, \ldots, g_N$ are said to
be linearly independent if whenever

    $a_1 g_1 + \cdots + a_N g_N = 0$ for some complex numbers $a_i$,

then $a_1 = a_2 = \cdots = a_N = 0$. In other words, no element $g_j$ is a
linear combination of the others. In particular, we note that none of the
$g_j$ can be 0. We say that a countable family of elements is linearly
independent if all finite subsets of this family are linearly independent.

If we next successively disregard the elements $h_k$ that are linearly
dependent on the previous elements $h_1, h_2, \ldots, h_{k-1}$, then the resulting
collection $h_1 = f_1, f_2, \ldots, f_k, \ldots$ consists of linearly independent
elements, whose finite linear combinations are the same as those given by
$h_1, h_2, \ldots, h_k, \ldots$, and hence these linear combinations are also dense in
$\mathcal{H}$.

The proof of the theorem now follows from an application of a familiar
construction called the Gram-Schmidt process. Given a finite family
of elements $\{f_1, \ldots, f_k\}$ we call the span of this family the set of all
elements which are finite linear combinations of the elements $\{f_1, \ldots, f_k\}$.
We denote the span of $\{f_1, \ldots, f_k\}$ by $\mathrm{Span}(\{f_1, \ldots, f_k\})$.

We now construct a sequence of orthonormal vectors $e_1, e_2, \ldots$ such
that $\mathrm{Span}(\{e_1, \ldots, e_n\}) = \mathrm{Span}(\{f_1, \ldots, f_n\})$ for all $n \ge 1$. We do this
by induction.

By the linear independence hypothesis, $f_1 \ne 0$, so we may take
$e_1 = f_1/\|f_1\|$. Next, assume that orthonormal vectors $e_1, \ldots, e_k$ have been
found such that $\mathrm{Span}(\{e_1, \ldots, e_k\}) = \mathrm{Span}(\{f_1, \ldots, f_k\})$ for a given $k$.
We then try $e'_{k+1}$ as $f_{k+1} + \sum_{j=1}^{k} a_j e_j$. To have $(e'_{k+1}, e_j) = 0$ requires
that $a_j = -(f_{k+1}, e_j)$, and this choice of $a_j$ for $1 \le j \le k$ assures that
$e'_{k+1}$ is orthogonal to $e_1, \ldots, e_k$. Moreover our linear independence
hypothesis assures that $e'_{k+1} \ne 0$; hence we need only “renormalize” and
take $e_{k+1} = e'_{k+1}/\|e'_{k+1}\|$ to complete the inductive step. With this we
have found an orthonormal basis for $\mathcal{H}$.
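
The inductive step can be carried out mechanically. The following is a minimal sketch in $\mathbb{C}^d$, not part of the original text: the function names and the sample vectors `fs` are illustrative assumptions. The inner product is again linear in the first variable and conjugate-linear in the second.

```python
import math

def inner(f, g):
    """Inner product on C^d: linear in f, conjugate-linear in g."""
    return sum(a * b.conjugate() for a, b in zip(f, g))

def norm(f):
    return math.sqrt(inner(f, f).real)

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a linearly independent list f_1, ..., f_n in C^d."""
    basis = []
    for f in vectors:
        # Try e' = f + sum_j a_j e_j with a_j = -(f, e_j).
        candidate = list(f)
        for e in basis:
            a = -inner(f, e)
            candidate = [c + a * x for c, x in zip(candidate, e)]
        length = norm(candidate)
        if length < tol:
            # e' = 0 would mean f depends linearly on the earlier vectors.
            raise ValueError("vectors are (numerically) linearly dependent")
        # "Renormalize" to obtain the next orthonormal vector.
        basis.append([c / length for c in candidate])
    return basis

fs = [[1, 1, 0], [1, 0, 1j], [0, 1, 1]]   # linearly independent in C^3
es = gram_schmidt(fs)
```

The linear independence hypothesis shows up in the code as the check that the candidate vector is non-zero before it is normalized.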

Note that we have implicitly assumed that the number of linearly
independent elements $f_1, f_2, \ldots$ is infinite. In the case where there are only
$N$ linearly independent vectors $f_1, \ldots, f_N$, then $e_1, \ldots, e_N$ constructed
in the same way also provide an orthonormal basis for $\mathcal{H}$. These two
cases are differentiated in the following definition. If $\mathcal{H}$ is a Hilbert space
with an orthonormal basis consisting of finitely many elements, then we
say that $\mathcal{H}$ is finite-dimensional. Otherwise $\mathcal{H}$ is said to be
infinite-dimensional.

2.2 Unitary mappings 

A correspondence between two Hilbert spaces that preserves their
structure is a unitary transformation. More precisely, suppose we are given
two Hilbert spaces $\mathcal{H}$ and $\mathcal{H}'$ with respective inner products $(\cdot,\cdot)_{\mathcal{H}}$
and $(\cdot,\cdot)_{\mathcal{H}'}$, and the corresponding norms $\|\cdot\|_{\mathcal{H}}$ and $\|\cdot\|_{\mathcal{H}'}$. A mapping
$U : \mathcal{H} \to \mathcal{H}'$ between these spaces is called unitary if:

(i) $U$ is linear, that is, $U(\alpha f + \beta g) = \alpha U(f) + \beta U(g)$.

(ii) $U$ is a bijection.

(iii) $\|Uf\|_{\mathcal{H}'} = \|f\|_{\mathcal{H}}$ for all $f \in \mathcal{H}$.

Some observations are in order. First, since $U$ is bijective it must
have an inverse $U^{-1} : \mathcal{H}' \to \mathcal{H}$ that is also unitary. Part (iii) above also
implies that if $U$ is unitary, then

    $(Uf, Ug)_{\mathcal{H}'} = (f, g)_{\mathcal{H}}$ for all $f, g \in \mathcal{H}$.

To see this, it suffices to “polarize,” that is, to note that for any vector
space (say over $\mathbb{C}$) with inner product $(\cdot,\cdot)$ and norm $\|\cdot\|$, we have

    $(F, G) = \frac{1}{4}\left[ \|F + G\|^2 - \|F - G\|^2 + i\left( \|F/i + G\|^2 - \|F/i - G\|^2 \right) \right]$

whenever $F$ and $G$ are elements of the space.

The above leads us to say that the two Hilbert spaces $\mathcal{H}$ and $\mathcal{H}'$ are
unitarily equivalent or unitarily isomorphic if there exists a unitary
mapping $U : \mathcal{H} \to \mathcal{H}'$. Clearly, unitary isomorphism of Hilbert spaces is
an equivalence relation.

With this definition we are now in a position to give precise meaning
to the statement we made earlier that all infinite-dimensional Hilbert
spaces are the same and in that sense $\ell^2(\mathbb{N})$ in disguise.

Corollary 2.5 Any two infinite-dimensional Hilbert spaces are unitarily
equivalent.

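Proof. If $\mathcal{H}$ and $\mathcal{H}'$ are two infinite-dimensional Hilbert spaces, we
may select for each an orthonormal basis, say

    $\{e_1, e_2, \ldots\} \subset \mathcal{H}$ and $\{e'_1, e'_2, \ldots\} \subset \mathcal{H}'$.

Then, consider the mapping $U$ defined as follows: if $f = \sum_{k=1}^{\infty} a_k e_k$, then

    $U(f) = g$, where $g = \sum_{k=1}^{\infty} a_k e'_k$.

Clearly, the mapping $U$ is both linear and invertible. Moreover, by
Parseval’s identity, we must have

    $\|Uf\|_{\mathcal{H}'}^2 = \|g\|_{\mathcal{H}'}^2 = \sum_{k=1}^{\infty} |a_k|^2 = \|f\|_{\mathcal{H}}^2$,

and the corollary is proved.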

Consequently, all infinite-dimensional Hilbert spaces are unitarily
equivalent to $\ell^2(\mathbb{N})$, and thus, by relabeling, to $\ell^2(\mathbb{Z})$. By similar reasoning
we also have the following:

Corollary 2.6 Any two finite-dimensional Hilbert spaces are unitarily
equivalent if and only if they have the same dimension.

Thus every finite-dimensional Hilbert space over $\mathbb{C}$ (or over $\mathbb{R}$) is
equivalent with $\mathbb{C}^d$ (or $\mathbb{R}^d$) for some $d$.

2.3 Pre-Hilbert spaces 

Although Hilbert spaces arise naturally, one often starts with a
pre-Hilbert space instead, that is, a space $\mathcal{H}_0$ that satisfies all the defining
properties of a Hilbert space except (v); in other words $\mathcal{H}_0$ is not assumed
to be complete. A prime example arose implicitly early in the study of
Fourier series with the space $\mathcal{H}_0 = \mathcal{R}$ of Riemann integrable functions
on $[-\pi, \pi]$ with the usual inner product; we return to this below. Other
examples appear in the next chapter in the study of the solutions of
partial differential equations.


Fortunately, every pre-Hilbert space $\mathcal{H}_0$ can be completed.

Proposition 2.7 Suppose we are given a pre-Hilbert space $\mathcal{H}_0$ with
inner product $(\cdot,\cdot)_0$. Then we can find a Hilbert space $\mathcal{H}$ with inner product
$(\cdot,\cdot)$ such that

(i) $\mathcal{H}_0 \subset \mathcal{H}$.

(ii) $(f, g)_0 = (f, g)$ whenever $f, g \in \mathcal{H}_0$.

(iii) $\mathcal{H}_0$ is dense in $\mathcal{H}$.

A Hilbert space satisfying properties like $\mathcal{H}$ in the above proposition is
called a completion of $\mathcal{H}_0$. We shall only sketch the construction of
$\mathcal{H}$, since it follows closely Cantor’s familiar method of obtaining the real
numbers as the completion of the rationals in terms of Cauchy sequences
of rationals.

Indeed, consider the collection of all Cauchy sequences $\{f_n\}$ with
$f_n \in \mathcal{H}_0$, $1 \le n < \infty$. One defines an equivalence relation in this collection
by saying that $\{f_n\}$ is equivalent to $\{f'_n\}$ if $f_n - f'_n$ converges to 0 as
$n \to \infty$. The collection of equivalence classes is then taken to be $\mathcal{H}$. One
then easily verifies that $\mathcal{H}$ inherits the structure of a vector space, with
an inner product $(f, g)$ defined as $\lim_{n \to \infty} (f_n, g_n)$, where $\{f_n\}$ and $\{g_n\}$
are Cauchy sequences in $\mathcal{H}_0$, representing, respectively, the elements $f$
and $g$ in $\mathcal{H}$. Next, if $f \in \mathcal{H}_0$ we take the sequence $\{f_n\}$, with $f_n = f$ for
all $n$, to represent $f$ as an element of $\mathcal{H}$, giving $\mathcal{H}_0 \subset \mathcal{H}$. To see that
$\mathcal{H}$ is complete, let $\{F^k\}_{k=1}^{\infty}$ be a Cauchy sequence in $\mathcal{H}$, with each $F^k$
represented by $\{f^k_n\}_{n=1}^{\infty}$, $f^k_n \in \mathcal{H}_0$. If we define $F \in \mathcal{H}$ as represented by
the sequence $\{f_n\}$ with $f_n = f^n_{N(n)}$, where $N(n)$ is so that
$\|f^n_{N(n)} - f^n_j\| \le 1/n$ for $j \ge N(n)$, then we note that $F^k \to F$ in $\mathcal{H}$.

One can also observe that the completion $\mathcal{H}$ of $\mathcal{H}_0$ is unique up to
isomorphism. (See Exercise 14.)

3 Fourier series and Fatou’s theorem 

We have already seen an interesting relation between Hilbert spaces and 
some elementary facts about Fourier series. Here we want to pursue this 
idea and also connect it with complex analysis. 

When considering Fourier series, it is natural to begin by turning to
the broader class of all integrable functions on $[-\pi, \pi]$. Indeed, note that
$L^2([-\pi, \pi]) \subset L^1([-\pi, \pi])$, by the Cauchy-Schwarz inequality, since the
interval $[-\pi, \pi]$ has finite measure. Thus, if $f \in L^1([-\pi, \pi])$ and $n \in \mathbb{Z}$,
we define the $n$th Fourier coefficient of $f$ by

    $a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx$.

The Fourier series of $f$ is then formally $\sum_{n=-\infty}^{\infty} a_n e^{inx}$, and we write

    $f(x) \sim \sum_{n=-\infty}^{\infty} a_n e^{inx}$

to indicate that the sum on the right is the Fourier series of the
function on the left. The theory developed thus far provides the natural
generalization of some earlier results obtained in Book I.

Theorem 3.1 Suppose $f$ is integrable on $[-\pi, \pi]$.

(i) If $a_n = 0$ for all $n$, then $f(x) = 0$ for a.e. $x$.

(ii) $\sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{inx}$ tends to $f(x)$ for a.e. $x$, as $r \to 1$, $r < 1$.

The second conclusion is the almost everywhere “Abel summability” to
$f$ of its Fourier series. Note that since $|a_n| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|\,dx$, the series
$\sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{inx}$ converges absolutely and uniformly for each $r$, $0 \le r < 1$.

Proof. The first conclusion is an immediate consequence of the second.
To prove the latter we recall the identity

    $\sum_{n=-\infty}^{\infty} r^{|n|} e^{iny} = P_r(y) = \frac{1 - r^2}{1 - 2r \cos y + r^2}$

for the Poisson kernel; see Book I, Chapter 2. Starting with our given
$f \in L^1([-\pi, \pi])$ we extend it as a function on $\mathbb{R}$ by making it periodic of
period $2\pi$.† We then claim that for every $x$

(3) $\sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{inx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) P_r(y)\,dy$.

Indeed, by the dominated convergence theorem the right-hand side equals

    $\sum_{n=-\infty}^{\infty} r^{|n|} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) e^{iny}\,dy$.

Moreover, for each $x$ and $n$

    $\int_{-\pi}^{\pi} f(x - y) e^{iny}\,dy = \int_{-\pi + x}^{\pi + x} f(y) e^{in(x - y)}\,dy = e^{inx} \int_{-\pi}^{\pi} f(y) e^{-iny}\,dy = e^{inx}\, 2\pi a_n$.

†Note that we may without loss of generality assume that $f(\pi) = f(-\pi)$ so as to make
the periodic extension unambiguous.

The first equality follows by translation invariance (see Section 3,
Chapter 2), and the second since $\int_{-\pi}^{\pi} F(y)\,dy = \int_I F(y)\,dy$ whenever $F$ is
periodic of period $2\pi$ and $I$ is an interval of length $2\pi$ (Exercise 3, Chapter 2).
With these observations, the identity (3) is established. We can now
invoke the facts about approximations to the identity (Theorem 2.1 and
Example 4, Chapter 3) to conclude that the left-hand side of (3) tends to
$f(x)$ at every point of the Lebesgue set of $f$, hence almost everywhere.
(To be correct, the hypotheses of the theorem require that $f$ be integrable
on all of $\mathbb{R}$. We can achieve this for our periodic function by setting $f$
equal to zero outside $[-2\pi, 2\pi]$, and then (3) still holds for this modified
$f$, whenever $x \in [-\pi, \pi]$.)
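
Both the Poisson kernel identity and the Abel summability statement can be checked numerically. The following is a minimal sketch, not part of the original text. It assumes the test function $f(x) = x$ on $[-\pi, \pi]$, whose Fourier coefficients are $a_0 = 0$ and $a_n = i(-1)^n/n$ for $n \ne 0$; the function names and truncation lengths are likewise illustrative.

```python
import cmath
import math

def poisson_kernel(r, y):
    """Closed form P_r(y) = (1 - r^2) / (1 - 2 r cos y + r^2)."""
    return (1 - r * r) / (1 - 2 * r * math.cos(y) + r * r)

def poisson_series(r, y, terms=200):
    """Truncation of sum_n r^{|n|} e^{iny}; the imaginary parts cancel."""
    total = sum(r ** abs(n) * cmath.exp(1j * n * y)
                for n in range(-terms, terms + 1))
    return total.real

def abel_mean_of_x(r, x, terms=3000):
    """Truncated Abel mean sum_n a_n r^{|n|} e^{inx} for f(x) = x.

    Assumes a_0 = 0 and a_n = i (-1)^n / n for n != 0.
    """
    total = 0j
    for n in range(-terms, terms + 1):
        if n == 0:
            continue
        a_n = 1j * (-1) ** n / n
        total += a_n * r ** abs(n) * cmath.exp(1j * n * x)
    return total.real

print(poisson_series(0.5, 1.0), poisson_kernel(0.5, 1.0))
print(abel_mean_of_x(0.5, 1.0), abel_mean_of_x(0.99, 1.0))
```

As $r$ increases toward 1 the Abel means at $x = 1$ move toward $f(1) = 1$, in line with conclusion (ii) at a point of continuity of $f$.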

We return to the more restrictive setting of $L^2$. We express the
essential conclusions of Theorem 2.3 in the context of Fourier series. With
$f \in L^2([-\pi, \pi])$, we write as before $a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx$.

Theorem 3.2 Suppose $f \in L^2([-\pi, \pi])$. Then:

(i) We have Parseval’s relation

    $\sum_{n=-\infty}^{\infty} |a_n|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx$.

(ii) The mapping $f \mapsto \{a_n\}$ is a unitary correspondence between
$L^2([-\pi, \pi])$ and $\ell^2(\mathbb{Z})$.

(iii) The Fourier series of $f$ converges to $f$ in the $L^2$-norm, that is,

    $\frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x) - S_N(f)(x)|^2\,dx \to 0$ as $N \to \infty$,

where $S_N(f) = \sum_{|n| \le N} a_n e^{inx}$.

To apply the previous results, we let $\mathcal{H} = L^2([-\pi, \pi])$ with inner
product $(f, g) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{g(x)}\,dx$, and take the orthonormal set
$\{e_k\}_{k=1}^{\infty}$ to be the exponentials $\{e^{inx}\}_{n=-\infty}^{\infty}$, with $k = 1$ when $n = 0$,
$k = 2n$ for $n > 0$, and $k = 2|n| + 1$ for $n < 0$, so that each $k \ge 1$ is used
exactly once.

By the previous result, assertion (ii) of Theorem 2.3 holds and thus
all the other conclusions hold. We therefore have Parseval’s relation,
and from (iv) we conclude that $\|f - S_N(f)\|^2 = \sum_{|n| > N} |a_n|^2 \to 0$ as
$N \to \infty$. Similarly, if $\{a_n\} \in \ell^2(\mathbb{Z})$ is given, then $\|S_N(f) - S_M(f)\|^2 \to 0$,
as $N, M \to \infty$. Hence the completeness of $L^2$ guarantees that there is
an $f \in L^2$ such that $\|f - S_N(f)\| \to 0$, and one verifies directly that $f$
has $\{a_n\}$ as its Fourier coefficients. Thus we deduce that the mapping
$f \mapsto \{a_n\}$ is onto and hence unitary. This is a key conclusion that holds
in the setting of $L^2$ and was not valid in an earlier context of Riemann
integrable functions. In fact the space $\mathcal{R}$ of such functions on $[-\pi, \pi]$ is
not complete in the norm, containing as it does the continuous functions,
but $\mathcal{R}$ is itself restricted to bounded functions.
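
Parseval’s relation can be tested on a concrete function. The following is a minimal sketch, not part of the original text. It assumes $f(x) = x$ on $[-\pi, \pi]$, for which $a_0 = 0$, $a_n = i(-1)^n/n$ for $n \ne 0$, and $\frac{1}{2\pi}\int_{-\pi}^{\pi} x^2\,dx = \pi^2/3$; the midpoint-rule quadrature and all function names are illustrative choices.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=4096):
    """Midpoint-rule approximation of (1/2pi) * integral of f(x) e^{-inx}."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        x = -math.pi + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)

def exact_coefficient(n):
    """Closed form for f(x) = x: a_0 = 0 and a_n = i (-1)^n / n otherwise."""
    return 0j if n == 0 else 1j * (-1) ** n / n

def partial_energy(N):
    """Return the sum of |a_n|^2 over |n| <= N, using the closed form."""
    return sum(abs(exact_coefficient(n)) ** 2 for n in range(-N, N + 1))

norm_sq = math.pi ** 2 / 3   # (1/2pi) * integral of x^2 over [-pi, pi]

print(fourier_coefficient(lambda x: x, 3), exact_coefficient(3))
print(partial_energy(2000), norm_sq)
```

The partial sums stay below $\|f\|^2$, as Bessel’s inequality requires, and the gap $\sum_{|n| > N} |a_n|^2$ is exactly the squared $L^2$ error $\|f - S_N(f)\|^2$ of the $N$th partial sum.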

3.1 Fatou’s theorem 

Fatou’s theorem is a remarkable result in complex analysis. Its proof 
combines elements of Hilbert spaces, Fourier series, and deeper ideas of 
differentiation theory, and yet none of these notions appear in its state- 
ment. The question that Fatou’s theorem answers may be put simply as 
follows. 

    Suppose $F(z)$ is holomorphic in the unit disc $\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\}$.
    What are conditions on $F$ that guarantee that $F(z)$
    will converge, in an appropriate sense, to boundary values
    $F(e^{i\theta})$ on the unit circle?

In general a holomorphic function in the unit disc can behave quite
erratically near the boundary. It turns out, however, that imposing a
simple boundedness condition is enough to obtain a strong conclusion.

If $F$ is a function defined in the unit disc $\mathbb{D}$, we say that $F$ has a radial
limit at the point $-\pi \le \theta \le \pi$ on the circle, if the limit

    $\lim_{r \to 1,\ r < 1} F(re^{i\theta})$

exists.

Theorem 3.3 A bounded holomorphic function $F(re^{i\theta})$ on the unit disc
has radial limits at almost every $\theta$.

Proof. We know that $F(z)$ has a power series expansion $\sum_{n=0}^{\infty} a_n z^n$ in
$\mathbb{D}$ that converges absolutely and uniformly whenever $z = re^{i\theta}$ and $r < 1$.
In fact, for $r < 1$ the series $\sum_{n=0}^{\infty} a_n r^n e^{in\theta}$ is the Fourier series of the
function $F(re^{i\theta})$, that is,

    $a_n r^n = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(re^{i\theta}) e^{-in\theta}\,d\theta$ when $n \ge 0$,

and the integral vanishes when $n < 0$. (See also Chapter 3, Section 7 in
Book II).

We pick $M$ so that $|F(z)| \le M$, for all $z \in \mathbb{D}$. By Parseval’s identity

    $\sum_{n=0}^{\infty} |a_n|^2 r^{2n} = \frac{1}{2\pi} \int_{-\pi}^{\pi} |F(re^{i\theta})|^2\,d\theta$ for each $0 \le r < 1$.

Letting $r \to 1$ one sees that $\sum |a_n|^2$ converges (and is $\le M^2$). We now let
$F(e^{i\theta})$ be the $L^2$-function whose Fourier coefficients are $a_n$ when $n \ge 0$,
and 0 when $n < 0$. Hence by conclusion (ii) in Theorem 3.1

    $\sum_{n=0}^{\infty} a_n r^n e^{in\theta} \to F(e^{i\theta})$, for a.e. $\theta$,

concluding the proof of the theorem.

If we examine the argument given above we see that the same
conclusion holds for a larger class of functions. In this connection, we define
the Hardy space $H^2(\mathbb{D})$ to consist of all holomorphic functions $F$ on
the unit disc $\mathbb{D}$ that satisfy

    $\sup_{0 \le r < 1} \frac{1}{2\pi} \int_{-\pi}^{\pi} |F(re^{i\theta})|^2\,d\theta < \infty$.

We also define the “norm” for functions $F$ in this class, $\|F\|_{H^2(\mathbb{D})}$, to be
the square root of the above quantity.

One notes that if $F$ is bounded, then $F \in H^2(\mathbb{D})$, and moreover the
conclusion of the existence of radial limits almost everywhere holds for
any $F \in H^2(\mathbb{D})$, by the same argument given for the bounded case.†
Finally, one notes that $F \in H^2(\mathbb{D})$ if and only if $F(z) = \sum_{n=0}^{\infty} a_n z^n$ with
$\sum_{n=0}^{\infty} |a_n|^2 < \infty$; moreover, $\sum_{n=0}^{\infty} |a_n|^2 = \|F\|_{H^2(\mathbb{D})}^2$. This states in
particular that $H^2(\mathbb{D})$ is in fact a Hilbert space that can be viewed as the
“subspace” $\ell^2(\mathbb{Z}^+)$ of $\ell^2(\mathbb{Z})$, consisting of all $\{a_n\} \in \ell^2(\mathbb{Z})$, with $a_n = 0$
when $n < 0$.
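
The identity between circle averages and power series coefficients is easy to test on an example. The following is a minimal sketch, not part of the original text. It assumes the function $F(z) = 1/(1 - z/2)$, which is bounded on the disc and has $a_n = 2^{-n}$, so that $\sum |a_n|^2 = 4/3$; the equally spaced quadrature and the function names are illustrative choices.

```python
import cmath
import math

def circle_mean_square(F, r, samples=2048):
    """Approximate (1/2pi) * integral of |F(r e^{i theta})|^2 d theta."""
    total = 0.0
    for j in range(samples):
        theta = -math.pi + 2 * math.pi * j / samples
        total += abs(F(r * cmath.exp(1j * theta))) ** 2
    return total / samples

def F(z):
    """Assumed example: F(z) = 1/(1 - z/2), with coefficients a_n = 2^{-n}."""
    return 1 / (1 - z / 2)

def coefficient_sum(r, terms=200):
    """Truncation of sum_n |a_n|^2 r^{2n} with a_n = 2^{-n}."""
    return sum((0.5 ** n) ** 2 * r ** (2 * n) for n in range(terms))

for r in (0.5, 0.9, 0.99):
    print(r, circle_mean_square(F, r), coefficient_sum(r))
```

The averages increase with $r$ and remain below $\sum |a_n|^2 = 4/3$, which is the value of $\|F\|_{H^2(\mathbb{D})}^2$ for this example.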

Some general considerations of subspaces and their concomitant or- 
thogonal projections will be taken up next. 


4 Closed subspaces and orthogonal projections 

A linear subspace $\mathcal{S}$ (or simply subspace) of $\mathcal{H}$ is a subset of $\mathcal{H}$ that
satisfies $\alpha f + \beta g \in \mathcal{S}$ whenever $f, g \in \mathcal{S}$ and $\alpha, \beta$ are scalars. In other
words, $\mathcal{S}$ is also a vector space. For example in $\mathbb{R}^3$, lines passing through
the origin and planes passing through the origin are the one-dimensional
and two-dimensional subspaces, respectively.

†An even more general statement is given in Problem 5*.

The subspace $\mathcal{S}$ is closed if whenever $\{f_n\} \subset \mathcal{S}$ converges to some
$f \in \mathcal{H}$, then $f$ also belongs to $\mathcal{S}$. In the case of finite-dimensional Hilbert
spaces, every subspace is closed. This is, however, not true in the
general case of infinite-dimensional Hilbert spaces. For instance, as we
have already indicated, the subspace of Riemann integrable functions
in $L^2([-\pi, \pi])$ is not closed, nor is the subspace obtained by fixing a
basis and taking all vectors that are finite linear combinations of these basis
elements. It is useful to note that every closed subspace $\mathcal{S}$ of $\mathcal{H}$ is itself a
Hilbert space, with the inner product on $\mathcal{S}$ that which is inherited from
$\mathcal{H}$. (For the separability of $\mathcal{S}$, see Exercise 11.)

Next, we show that a closed subspace enjoys an important character- 
istic property of Euclidean geometry. 

Lemma 4.1 Suppose $\mathcal{S}$ is a closed subspace of $\mathcal{H}$ and $f \in \mathcal{H}$. Then:

(i) There exists a (unique) element $g_0 \in \mathcal{S}$ which is closest to $f$, in the
sense that

    $\|f - g_0\| = \inf_{g \in \mathcal{S}} \|f - g\|$.

(ii) The element $f - g_0$ is perpendicular to $\mathcal{S}$, that is,

    $(f - g_0, g) = 0$ for all $g \in \mathcal{S}$.


The situation in the lemma can be visualized as in Figure 1.

Figure 1. Nearest element to $f$ in $\mathcal{S}$

Proof. If $f \in \mathcal{S}$, then we choose $g_0 = f$, and there is nothing left
to prove. Otherwise, we let $d = \inf_{g \in \mathcal{S}} \|f - g\|$, and note that we must
have $d > 0$ since $f \notin \mathcal{S}$ and $\mathcal{S}$ is closed. Consider a sequence $\{g_n\}_{n=1}^{\infty}$ in
$\mathcal{S}$ such that

    $\|f - g_n\| \to d$ as $n \to \infty$.

We claim that $\{g_n\}$ is a Cauchy sequence whose limit will be the desired
element $g_0$. In fact, it would suffice to show that a subsequence of $\{g_n\}$
converges, and this is immediate in the finite-dimensional case because
a closed ball is compact. However, in general this compactness fails, as
we shall see in Section 6, and so a more intricate argument is needed at
this point.

To prove our claim, we use the parallelogram law, which states that
in a Hilbert space $\mathcal{H}$

(4) $\|A + B\|^2 + \|A - B\|^2 = 2\left[ \|A\|^2 + \|B\|^2 \right]$ for all $A, B \in \mathcal{H}$.

The simple verification of this equality, which consists of writing each
norm in terms of the inner product, is left to the reader. Putting
$A = f - g_n$ and $B = f - g_m$ in the parallelogram law, we find

    $\|2f - (g_n + g_m)\|^2 + \|g_m - g_n\|^2 = 2\left[ \|f - g_n\|^2 + \|f - g_m\|^2 \right]$.

However $\mathcal{S}$ is a subspace, so the quantity $\frac{1}{2}(g_n + g_m)$ belongs to $\mathcal{S}$, hence

    $\|2f - (g_n + g_m)\| = 2\|f - \frac{1}{2}(g_n + g_m)\| \ge 2d$.

Therefore

    $\|g_m - g_n\|^2 = 2\left[ \|f - g_n\|^2 + \|f - g_m\|^2 \right] - \|2f - (g_n + g_m)\|^2$
                  $\le 2\left[ \|f - g_n\|^2 + \|f - g_m\|^2 \right] - 4d^2$.

By construction, we know that $\|f - g_n\| \to d$ and $\|f - g_m\| \to d$ as $n, m \to \infty$,
so the above inequality implies that $\{g_n\}$ is a Cauchy sequence. Since
$\mathcal{H}$ is complete and $\mathcal{S}$ closed, the sequence $\{g_n\}$ must have a limit $g_0$ in
$\mathcal{S}$, and then it satisfies $d = \|f - g_0\|$.

We prove that if $g \in \mathcal{S}$, then $g \perp (f - g_0)$. For each $\epsilon$ (positive or
negative), consider the perturbation of $g_0$ defined by $g_0 - \epsilon g$. This element
belongs to $\mathcal{S}$, hence

    $\|f - (g_0 - \epsilon g)\|^2 \ge \|f - g_0\|^2$.

Since $\|f - (g_0 - \epsilon g)\|^2 = \|f - g_0\|^2 + \epsilon^2 \|g\|^2 + 2\epsilon\,\mathrm{Re}(f - g_0, g)$, we find
that

(5) $2\epsilon\,\mathrm{Re}(f - g_0, g) + \epsilon^2 \|g\|^2 \ge 0$.

If $\mathrm{Re}(f - g_0, g) < 0$, then taking $\epsilon$ small and positive contradicts (5).
If $\mathrm{Re}(f - g_0, g) > 0$, a contradiction also follows by taking $\epsilon$ small and
negative. Thus $\mathrm{Re}(f - g_0, g) = 0$. By considering the perturbation
$g_0 - i\epsilon g$, a similar argument gives $\mathrm{Im}(f - g_0, g) = 0$, and hence
$(f - g_0, g) = 0$.

Finally, the uniqueness of $g_0$ follows from the above observation about
orthogonality. Suppose $\tilde{g}_0$ is another point in $\mathcal{S}$ that minimizes the
distance to $f$. By taking $g = g_0 - \tilde{g}_0$ in our last argument we find
$(f - g_0) \perp (g_0 - \tilde{g}_0)$, and the Pythagorean theorem gives

    $\|f - \tilde{g}_0\|^2 = \|f - g_0\|^2 + \|g_0 - \tilde{g}_0\|^2$.

Since by assumption $\|f - \tilde{g}_0\|^2 = \|f - g_0\|^2$, we conclude that
$\|g_0 - \tilde{g}_0\| = 0$, as desired.

Using the lemma, we may now introduce a useful concept that is
another expression of the notion of orthogonality. If $\mathcal{S}$ is a subspace of a
Hilbert space $\mathcal{H}$, we define the orthogonal complement of $\mathcal{S}$ by

    $\mathcal{S}^{\perp} = \{f \in \mathcal{H} : (f, g) = 0 \text{ for all } g \in \mathcal{S}\}$.

Clearly, $\mathcal{S}^{\perp}$ is also a subspace of $\mathcal{H}$, and moreover $\mathcal{S} \cap \mathcal{S}^{\perp} = \{0\}$. To see
this, note that if $f \in \mathcal{S} \cap \mathcal{S}^{\perp}$, then $f$ must be orthogonal to itself; thus
$0 = (f, f) = \|f\|^2$, and therefore $f = 0$. Moreover, $\mathcal{S}^{\perp}$ is itself a closed
subspace. Indeed, if $f_n \to f$, then $(f_n, g) \to (f, g)$ for every $g$, by the
Cauchy-Schwarz inequality. Hence if $(f_n, g) = 0$ for all $g \in \mathcal{S}$ and all $n$,
then $(f, g) = 0$ for all those $g$.

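Proposition 4.2 If $\mathcal{S}$ is a closed subspace of a Hilbert space $\mathcal{H}$, then

    $\mathcal{H} = \mathcal{S} \oplus \mathcal{S}^{\perp}$.

The notation in the proposition means that every $f \in \mathcal{H}$ can be written
uniquely as $f = g + h$, where $g \in \mathcal{S}$ and $h \in \mathcal{S}^{\perp}$; we say that $\mathcal{H}$ is the
direct sum of $\mathcal{S}$ and $\mathcal{S}^{\perp}$. This is equivalent to saying that any $f$ in $\mathcal{H}$
is the sum of two elements, one in $\mathcal{S}$, the other in $\mathcal{S}^{\perp}$, and that $\mathcal{S} \cap \mathcal{S}^{\perp}$
contains only 0.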

The proof of the proposition relies on the previous lemma giving the
element of $\mathcal{S}$ closest to $f$. In fact, for any $f \in \mathcal{H}$, we choose $g_0$ as in the
lemma and write

    $f = g_0 + (f - g_0)$.

By construction $g_0 \in \mathcal{S}$, and the lemma implies $f - g_0 \in \mathcal{S}^{\perp}$, and this
shows that $f$ is the sum of an element in $\mathcal{S}$ and one in $\mathcal{S}^{\perp}$. To prove that
this decomposition is unique, suppose that

    $f = g + h = \tilde{g} + \tilde{h}$ where $g, \tilde{g} \in \mathcal{S}$ and $h, \tilde{h} \in \mathcal{S}^{\perp}$.

Then, we must have $g - \tilde{g} = \tilde{h} - h$. Since the left-hand side belongs to
$\mathcal{S}$ while the right-hand side belongs to $\mathcal{S}^{\perp}$, the fact that $\mathcal{S} \cap \mathcal{S}^{\perp} = \{0\}$
implies $g - \tilde{g} = 0$ and $\tilde{h} - h = 0$. Therefore $g = \tilde{g}$ and $h = \tilde{h}$ and the
uniqueness is established.

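With the decomposition $\mathcal{H} = \mathcal{S} \oplus \mathcal{S}^{\perp}$ one has the natural projection
onto $\mathcal{S}$ defined by

    $P_{\mathcal{S}}(f) = g$, where $f = g + h$ and $g \in \mathcal{S}$, $h \in \mathcal{S}^{\perp}$.

The mapping $P_{\mathcal{S}}$ is called the orthogonal projection onto $\mathcal{S}$ and
satisfies the following simple properties:

(i) $f \mapsto P_{\mathcal{S}}(f)$ is linear,

(ii) $P_{\mathcal{S}}(f) = f$ whenever $f \in \mathcal{S}$,

(iii) $P_{\mathcal{S}}(f) = 0$ whenever $f \in \mathcal{S}^{\perp}$,

(iv) $\|P_{\mathcal{S}}(f)\| \le \|f\|$ for all $f \in \mathcal{H}$.

Property (i) means that $P_{\mathcal{S}}(\alpha f_1 + \beta f_2) = \alpha P_{\mathcal{S}}(f_1) + \beta P_{\mathcal{S}}(f_2)$, whenever
$f_1, f_2 \in \mathcal{H}$ and $\alpha$ and $\beta$ are scalars.

It will be useful to observe the following. Suppose $\{e_k\}$ is a (finite
or infinite) collection of orthonormal vectors in $\mathcal{H}$. Then the orthogonal
projection $P$ onto the closure of the subspace spanned by $\{e_k\}$ is given by
$P(f) = \sum_k (f, e_k) e_k$. In case the collection is infinite, the sum converges
in the norm of $\mathcal{H}$.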
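
In finite dimensions the formula $P(f) = \sum_k (f, e_k) e_k$ can be applied directly. The following is a minimal sketch in $\mathbb{C}^3$, not part of the original text: the orthonormal vectors `e1`, `e2`, the vector `f`, and the function names are illustrative assumptions.

```python
import math

def inner(f, g):
    """Inner product on C^n: linear in f, conjugate-linear in g."""
    return sum(a * b.conjugate() for a, b in zip(f, g))

def project(f, orthonormal):
    """Return P(f) = sum_k (f, e_k) e_k for orthonormal vectors e_k."""
    result = [0j] * len(f)
    for e in orthonormal:
        c = inner(f, e)
        result = [p + c * x for p, x in zip(result, e)]
    return result

# Orthonormal vectors spanning the first two coordinates of C^3.
s = 1 / math.sqrt(2)
e1 = [s + 0j, s + 0j, 0j]
e2 = [s + 0j, -s + 0j, 0j]

f = [1 + 2j, -1j, 3 + 0j]
Pf = project(f, [e1, e2])
residual = [a - b for a, b in zip(f, Pf)]   # f - P(f), expected in S-perp

print(Pf)
print([inner(residual, e) for e in (e1, e2)])
```

The residual $f - P(f)$ is orthogonal to each $e_k$, which is conclusion (ii) of Lemma 4.1 in this setting, and applying `project` a second time leaves `Pf` unchanged, which is property (ii) of the projection.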

We illustrate this with two examples that arise in Fourier analysis. 

Example 1. On $L^2([-\pi, \pi])$, recall that if $f(\theta) \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}$, then
the partial sums of the Fourier series are

    $S_N(f)(\theta) = \sum_{n=-N}^{N} a_n e^{in\theta}$.

Therefore, the partial sum operator $S_N$ consists of the projection onto
the closed subspace spanned by $\{e_{-N}, \ldots, e_N\}$.

The sum $S_N$ can be realized as a convolution

    $S_N(f)(\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(\theta - \varphi) f(\varphi)\,d\varphi$,

where $D_N(\theta) = \sin((N + 1/2)\theta)/\sin(\theta/2)$ is the Dirichlet kernel.
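
The closed form of the Dirichlet kernel comes from summing the exponentials, $D_N(\theta) = \sum_{n=-N}^{N} e^{in\theta}$. The following is a minimal sketch of a numerical check, not part of the original text; the function names and sample points are illustrative assumptions.

```python
import cmath
import math

def dirichlet_sum(N, theta):
    """D_N(theta) as the sum of e^{in theta} over |n| <= N (a real number)."""
    return sum(cmath.exp(1j * n * theta) for n in range(-N, N + 1)).real

def dirichlet_closed(N, theta):
    """Closed form sin((N + 1/2) theta) / sin(theta / 2), theta not in 2 pi Z."""
    return math.sin((N + 0.5) * theta) / math.sin(theta / 2)

for N, theta in [(1, 0.3), (5, 0.7), (10, -2.0)]:
    print(N, theta, dirichlet_sum(N, theta), dirichlet_closed(N, theta))
```

At multiples of $2\pi$ the closed form is undefined as written, while the sum gives the value $2N + 1$ there.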

Example 2. Once again, consider $L^2([-\pi, \pi])$ and let $\mathcal{S}$ denote the
subspace that consists of all $F \in L^2([-\pi, \pi])$ with

    $F(\theta) \sim \sum_{n=0}^{\infty} a_n e^{in\theta}$.

In other words, $\mathcal{S}$ is the space of square integrable functions whose
Fourier coefficients $a_n$ vanish for $n < 0$. From the proof of Fatou’s
theorem, this implies that $\mathcal{S}$ can be identified with the Hardy space $H^2(\mathbb{D})$,
where $\mathbb{D}$ is the unit disc, and so is a closed subspace unitarily isomorphic
to $\ell^2(\mathbb{Z}^+)$. Therefore, using this identification, if $P$ denotes the
orthogonal projection from $L^2([-\pi, \pi])$ to $\mathcal{S}$, we may also write $P(f)(z)$ for the
element corresponding to $H^2(\mathbb{D})$, that is,

    $P(f)(z) = \sum_{n=0}^{\infty} a_n z^n$.

Given $f \in L^2([-\pi, \pi])$, we define the Cauchy integral of $f$ by

    $C(f)(z) = \frac{1}{2\pi i} \int_{\gamma} \frac{f(\zeta)}{\zeta - z}\,d\zeta$,

where $\gamma$ denotes the unit circle and $z$ belongs to the unit disc. Then we
have the identity

    $P(f)(z) = C(f)(z)$, for all $z \in \mathbb{D}$.

Indeed, since $f \in L^2$ it follows by the Cauchy-Schwarz inequality that
$f \in L^1([-\pi, \pi])$, and therefore we may interchange the sum and integral

180 


Chapter 4. HILBERT SPACES: AN INTRODUCTION 


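in the following calculation (recall $|z| < 1$):

    $P(f)(z) = \sum_{n=0}^{\infty} a_n z^n = \sum_{n=0}^{\infty} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} f(e^{i\theta}) e^{-in\theta}\,d\theta \right) z^n$

             $= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(e^{i\theta}) \sum_{n=0}^{\infty} (e^{-i\theta} z)^n\,d\theta$

             $= \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(e^{i\theta})}{1 - e^{-i\theta} z}\,d\theta$

             $= \frac{1}{2\pi i} \int_{-\pi}^{\pi} \frac{f(e^{i\theta})}{e^{i\theta} - z}\, i e^{i\theta}\,d\theta$

             $= C(f)(z)$.
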

5 Linear transformations 

The focus of analysis in Hilbert spaces is largely the study of their lin- 
ear transformations. We have already encountered two classes of such 
transformations, the unitary mappings and the orthogonal projections. 
There are two other important classes we shall deal with in this chapter 
in some detail: the “linear functionals” and the “compact operators,” 
and in particular those that are symmetric. 

Suppose $\mathcal{H}_1$ and $\mathcal{H}_2$ are two Hilbert spaces. A mapping $T : \mathcal{H}_1 \to \mathcal{H}_2$
is a linear transformation (also called linear operator or operator)
if

    $T(af + bg) = aT(f) + bT(g)$ for all scalars $a, b$ and $f, g \in \mathcal{H}_1$.

Clearly, linear operators satisfy $T(0) = 0$.

We shall say that a linear operator $T : \mathcal{H}_1 \to \mathcal{H}_2$ is bounded if there
exists $M > 0$ so that, for all $f \in \mathcal{H}_1$,

(6) $\|T(f)\|_{\mathcal{H}_2} \le M \|f\|_{\mathcal{H}_1}$.

The norm of $T$ is denoted by $\|T\|_{\mathcal{H}_1 \to \mathcal{H}_2}$ or simply $\|T\|$ and defined by

    $\|T\| = \inf M$,

where the infimum is taken over all $M$ so that (6) holds. A trivial example
is given by the identity operator $I$, with $I(f) = f$. It is of course a
unitary operator and a projection, with $\|I\| = 1$.

In what follows we shall generally drop the subscripts attached to the
norms of elements of a Hilbert space, when this causes no confusion.

Lemma 5.1 $\|T\| = \sup\{|(Tf, g)| : \|f\| \le 1, \|g\| \le 1\}$, where of course
$f \in \mathcal{H}_1$ and $g \in \mathcal{H}_2$.

Proof. If $\|T\| \le M$, the Cauchy-Schwarz inequality gives

    $|(Tf, g)| \le M$ whenever $\|f\| \le 1$ and $\|g\| \le 1$;

thus $\sup\{|(Tf, g)| : \|f\| \le 1, \|g\| \le 1\} \le \|T\|$.

Conversely, if $\sup\{|(Tf, g)| : \|f\| \le 1, \|g\| \le 1\} \le M$, we claim that
$\|Tf\| \le M\|f\|$ for all $f$. If $f$ or $Tf$ is zero, there is nothing to prove.
Otherwise, $f' = f/\|f\|$ and $g' = Tf/\|Tf\|$ have norm 1, so by
assumption

    $|(Tf', g')| \le M$.

But since $|(Tf', g')| = \|Tf\|/\|f\|$ this gives $\|Tf\| \le M\|f\|$, and the
lemma is proved.

A linear transformation $T$ is continuous if $T(f_n) \to T(f)$ whenever
$f_n \to f$. Clearly, linearity implies that $T$ is continuous on all of $\mathcal{H}_1$ if
and only if it is continuous at the origin. In fact, the conditions of being
bounded or continuous are equivalent.

Proposition 5.2 A linear operator $T : \mathcal{H}_1 \to \mathcal{H}_2$ is bounded if and only
if it is continuous.

Proof. If $T$ is bounded, then $\|T(f) - T(f_n)\|_{\mathcal{H}_2} \le M\|f - f_n\|_{\mathcal{H}_1}$,
hence $T$ is continuous. Conversely, suppose that $T$ is continuous but
not bounded. Then for each $n$ there exists $f_n \ne 0$ such that
$\|T(f_n)\| \ge n\|f_n\|$. The element $g_n = f_n/(n\|f_n\|)$ has norm $1/n$, hence $g_n \to 0$.
Since $T$ is continuous at 0, we must have $T(g_n) \to 0$, which contradicts
the fact that $\|T(g_n)\| \ge 1$. This proves the proposition.

In the rest of this chapter we shall assume that all linear operators are
bounded, hence continuous. It is worth recalling that any linear
operator between finite-dimensional Hilbert spaces is necessarily
continuous.

5.1 Linear functionals and the Riesz representation theorem 

A linear functional $\ell$ is a linear transformation from a Hilbert space
$\mathcal{H}$ to the underlying field of scalars, which we may assume to be the
complex numbers,

    $\ell : \mathcal{H} \to \mathbb{C}$.

Of course, we view $\mathbb{C}$ as a Hilbert space equipped with its standard norm,
the absolute value.

A natural example of a linear functional is provided by the inner
product on $\mathcal{H}$. Indeed, for fixed $g \in \mathcal{H}$, the map

    $\ell(f) = (f, g)$

is linear, and also bounded by the Cauchy-Schwarz inequality. Indeed,
$|(f, g)| \le M\|f\|$, where $M = \|g\|$.

Moreover, $\ell(g) = M\|g\|$, so we have $\|\ell\| = \|g\|$. The remarkable fact is
that this example is exhaustive, in the sense that every continuous linear
functional on a Hilbert space arises as an inner product. This is the
so-called Riesz representation theorem.

Theorem 5.3 Let $\ell$ be a continuous linear functional on a Hilbert space
$\mathcal{H}$. Then, there exists a unique $g \in \mathcal{H}$ such that

    $\ell(f) = (f, g)$ for all $f \in \mathcal{H}$.

Moreover, $\|\ell\| = \|g\|$.

Proof. Consider the subspace of $\mathcal{H}$ defined by

    $\mathcal{S} = \{f \in \mathcal{H} : \ell(f) = 0\}$.

Since $\ell$ is continuous the subspace $\mathcal{S}$, which is called the null-space of $\ell$,
is closed. If $\mathcal{S} = \mathcal{H}$, then $\ell = 0$ and we take $g = 0$. Otherwise $\mathcal{S}^{\perp}$ is
non-trivial and we may pick any $h \in \mathcal{S}^{\perp}$ with $\|h\| = 1$. With this choice of $h$
we determine $g$ by setting $g = \overline{\ell(h)}\,h$. Thus if we let $u = \ell(f)h - \ell(h)f$,
then $u \in \mathcal{S}$, and therefore $(u, h) = 0$. Hence

    $0 = (\ell(f)h - \ell(h)f, h) = \ell(f)(h, h) - (f, \overline{\ell(h)}\,h)$.

Since $(h, h) = 1$, we find that $\ell(f) = (f, g)$ as desired.

At this stage we record the following remark for later use. Let $\mathcal{H}_0$
be a pre-Hilbert space whose completion is $\mathcal{H}$. Suppose $\ell_0$ is a linear
functional on $\mathcal{H}_0$ which is bounded, that is, $|\ell_0(f)| \le M\|f\|$ for all
$f \in \mathcal{H}_0$. Then $\ell_0$ has an extension $\ell$ to a bounded linear functional on $\mathcal{H}$,
with $|\ell(f)| \le M\|f\|$ for all $f \in \mathcal{H}$. This extension is also unique. To see
this, one merely notes that $\{\ell_0(f_n)\}$ is a Cauchy sequence whenever the
vectors $\{f_n\}$ belong to $\mathcal{H}_0$, and $f_n \to f$ in $\mathcal{H}$, as $n \to \infty$. Thus we may
define $\ell(f)$ as $\lim_{n \to \infty} \ell_0(f_n)$. The verification of the asserted properties
of $\ell$ is then immediate. (This result is a special case of the extension
Lemma 1.3 in the next chapter.)

5.2 Adjoints 

The first application of the Riesz representation theorem is to determine 
the existence of the “adjoint” of a linear transformation. 

Proposition 5.4 Let $T : \mathcal{H} \to \mathcal{H}$ be a bounded linear transformation.
There exists a unique bounded linear transformation $T^*$ on $\mathcal{H}$ so that:

(i) $(Tf, g) = (f, T^*g)$,

(ii) $\|T\| = \|T^*\|$,

(iii) $(T^*)^* = T$.

The linear operator $T^* : \mathcal{H} \to \mathcal{H}$ satisfying the above conditions is called
the adjoint of $T$.

To prove the existence of an operator satisfying (i) above, we observe
that for each fixed $g \in \mathcal{H}$, the linear functional $\ell = \ell_g$, defined by

    $\ell(f) = (Tf, g)$,

is bounded. Indeed, since $T$ is bounded one has $\|Tf\| \le M\|f\|$; hence
the Cauchy-Schwarz inequality implies that

    $|\ell(f)| \le \|Tf\|\,\|g\| \le B\|f\|$,

where $B = M\|g\|$. Consequently, the Riesz representation theorem
guarantees the existence of a unique $h \in \mathcal{H}$, $h = h_g$, such that

    $\ell(f) = (f, h)$.

Then we define $T^*g = h$, and note that the association $T^* : g \mapsto h$ is
linear and satisfies (i).

The fact that $\|T\| = \|T^*\|$ follows at once from (i) and Lemma 5.1:

    $\|T\| = \sup\{|(Tf, g)| : \|f\| \le 1, \|g\| \le 1\}$
          $= \sup\{|(f, T^*g)| : \|f\| \le 1, \|g\| \le 1\}$
          $= \|T^*\|$.

To prove (iii), note that $(Tf, g) = (f, T^*g)$ for all $f$ and $g$ if and only
if $(T^*f, g) = (f, Tg)$ for all $f$ and $g$, as one can see by taking complex
conjugates and reversing the roles of $f$ and $g$.
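
On $\mathbb{C}^n$ with the standard inner product, the adjoint of the operator given by a matrix is given by its conjugate transpose. The following is a minimal sketch checking properties (i) and (iii) in that setting, not part of the original text: the matrix `T`, the vectors `f`, `g`, and the function names are illustrative assumptions.

```python
def inner(f, g):
    """Inner product on C^n: linear in f, conjugate-linear in g."""
    return sum(a * b.conjugate() for a, b in zip(f, g))

def apply(T, f):
    """Apply the matrix T (a list of rows) to the vector f."""
    return [sum(T[i][j] * f[j] for j in range(len(f))) for i in range(len(T))]

def adjoint(T):
    """Conjugate transpose of T."""
    return [[T[j][i].conjugate() for j in range(len(T))]
            for i in range(len(T[0]))]

T = [[1 + 1j, 2], [0, 3 - 1j]]
f = [1, 1j]
g = [2 - 1j, 1]

lhs = inner(apply(T, f), g)             # (Tf, g)
rhs = inner(f, apply(adjoint(T), g))    # (f, T*g)
print(lhs, rhs)
print(adjoint(adjoint(T)) == T)
```

The two inner products agree, and taking the conjugate transpose twice returns the original matrix.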

We record here a few additional remarks. 

(a) In the special case when $T = T^*$ (we say that $T$ is symmetric), then

(7) $\|T\| = \sup\{|(Tf, f)| : \|f\| = 1\}$.

This should be compared to Lemma 5.1, which holds for any linear
operator. To establish (7), let $M = \sup\{|(Tf, f)| : \|f\| = 1\}$. By Lemma 5.1
it is clear that $M \le \|T\|$. Conversely, if $f$ and $g$ belong to $\mathcal{H}$, then one
has the following “polarization” identity which is easy to verify

    $(Tf, g) = \frac{1}{4} [ (T(f + g), f + g) - (T(f - g), f - g)$
              $+\ i\,(T(f + ig), f + ig) - i\,(T(f - ig), f - ig) ]$.

For any $h \in \mathcal{H}$ the quantity $(Th, h)$ is real, because $T = T^*$, hence
$(Th, h) = (h, T^*h) = (h, Th) = \overline{(Th, h)}$. Consequently

    $\mathrm{Re}(Tf, g) = \frac{1}{4} \left[ (T(f + g), f + g) - (T(f - g), f - g) \right]$.

Now $|(Th, h)| \le M\|h\|^2$, so $|\mathrm{Re}(Tf, g)| \le \frac{M}{4} \left[ \|f + g\|^2 + \|f - g\|^2 \right]$, and
an application of the parallelogram law (4) then implies

    $|\mathrm{Re}(Tf, g)| \le \frac{M}{2} \left[ \|f\|^2 + \|g\|^2 \right]$.

So if $\|f\| \le 1$ and $\|g\| \le 1$, then $|\mathrm{Re}(Tf, g)| \le M$. In general, we may
replace $g$ by $e^{i\theta} g$ in the last inequality to find that whenever $\|f\| \le 1$ and
$\|g\| \le 1$, then $|(Tf, g)| \le M$, and invoking Lemma 5.1 once again gives
the result, $\|T\| \le M$.

(b) Let us note that if $T$ and $S$ are bounded linear transformations of $\mathcal{H}$ to
itself, then so is their product $TS$, defined by $(TS)(f) = T(S(f))$.
Moreover we have automatically $(TS)^* = S^*T^*$; in fact,
$(TSf, g) = (Sf, T^*g) = (f, S^*T^*g)$.

(c) One can also exhibit a natural connection between linear
transformations on a Hilbert space and their associated bilinear forms. Suppose first
that $T$ is a bounded operator in $\mathcal{H}$. Define the corresponding bilinear
form $B$ by

(8) $B(f, g) = (Tf, g)$.

Note that $B$ is linear in $f$ and conjugate linear in $g$. Also by the
Cauchy-Schwarz inequality $|B(f, g)| \le M\|f\|\,\|g\|$, where $M = \|T\|$. Conversely if
$B$ is linear in $f$, conjugate linear in $g$ and satisfies $|B(f, g)| \le M\|f\|\,\|g\|$,
there is a unique linear transformation $T$ so that (8) holds with $M = \|T\|$.
This can be proved by the argument of Proposition 5.4; the details are
left to the reader.


5.3 Examples 

Having presented the elementary facts about Hilbert spaces, we now
digress to describe briefly the background of some of the early
developments of the theory. A motivating problem of considerable interest was
that of the study of the “eigenfunction expansion” of a differential
operator $L$. A particular case, that of a Sturm-Liouville operator, arises on
an interval $[a, b]$ of $\mathbb{R}$ with $L$ defined by

    $L = \frac{d^2}{dx^2} - q(x)$,

where $q$ is a given real-valued function. The question is then that of
expanding an “arbitrary” function in terms of the eigenfunctions $\varphi$, that
is those functions that satisfy $L(\varphi) = \mu\varphi$ for some $\mu \in \mathbb{R}$. The
classical example of this is that of Fourier series, where $L = d^2/dx^2$ on the
interval $[-\pi, \pi]$ with each exponential $e^{inx}$ an eigenfunction of $L$ with
eigenvalue $\mu = -n^2$.

When made precise in the “regular” case, the problem for L can be 
resolved by considering an associated “integral operator” T defined on 
L'^{[a,b]) by 

T{f){x)= [ K{x,y)f{y)dy, 

J a 


with the property that for suitable /, 


LT{f) = f. 


It turns out that a key feature that makes the study of T tractable is 
a certain compactness it enjoys. We now pass to the definitions and 
elaboration of some of these ideas, and begin by giving two relevant 
illustrations of classes of operators on Hilbert spaces. 


Infinite diagonal matrix

Suppose $\{\varphi_k\}_{k=1}^\infty$ is an orthonormal basis of $\mathcal{H}$. Then a linear
transformation $T : \mathcal{H} \to \mathcal{H}$ is said to be diagonalized with respect
to the basis $\{\varphi_k\}$ if

$$T(\varphi_k) = \lambda_k \varphi_k, \quad \text{where } \lambda_k \in \mathbb{C} \text{ for all } k.$$

In general, a non-zero element $\varphi$ is called an eigenvector of $T$ with
eigenvalue $\lambda$ if $T\varphi = \lambda\varphi$. So the $\varphi_k$ above are eigenvectors of $T$, and
the numbers $\lambda_k$ are the corresponding eigenvalues.

So if

$$f \sim \sum_{k=1}^\infty a_k \varphi_k, \quad \text{then} \quad Tf \sim \sum_{k=1}^\infty a_k \lambda_k \varphi_k.$$

The sequence $\{\lambda_k\}$ is called the multiplier sequence corresponding to
$T$.

In this case, one can easily verify the following facts:

• $\|T\| = \sup_k |\lambda_k|$.

• $T^*$ corresponds to the sequence $\{\overline{\lambda_k}\}$; hence $T = T^*$ if and only if
the $\lambda_k$ are real.

• $T$ is unitary if and only if $|\lambda_k| = 1$ for all $k$.

• $T$ is an orthogonal projection if and only if $\lambda_k = 0$ or $1$ for all $k$.

As a particular example, consider $\mathcal{H} = L^2([-\pi,\pi])$, and assume that
every $f \in L^2([-\pi,\pi])$ is extended to $\mathbb{R}$ by periodicity, so that
$f(x + 2\pi) = f(x)$ for all $x \in \mathbb{R}$. Let $\varphi_k(x) = e^{ikx}$ for $k \in \mathbb{Z}$.
For a fixed $h \in \mathbb{R}$ the operator $U_h$ defined by

$$U_h(f)(x) = f(x + h)$$

is unitary with $\lambda_k = e^{ikh}$. Hence

$$U_h(f) \sim \sum_{k=-\infty}^{\infty} a_k \lambda_k e^{ikx} \quad \text{if} \quad f \sim \sum_{k=-\infty}^{\infty} a_k e^{ikx}.$$
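
The translation example can be checked numerically on a trigonometric polynomial. The sketch below is only an illustration: the coefficients and the shift `h` are arbitrary choices, and the function names are hypothetical. It multiplies each Fourier coefficient $a_k$ by $\lambda_k = e^{ikh}$ and compares the result with $f(x+h)$, then confirms that the sum of $|a_k|^2$ is unchanged, as it must be for a unitary operator.

```python
import cmath

def trig_poly(coeffs, x):
    """Evaluate f(x) = sum_k a_k e^{ikx}, with coeffs given as {k: a_k}."""
    return sum(a * cmath.exp(1j * k * x) for k, a in coeffs.items())

def translate_coeffs(coeffs, h):
    """Coefficients of U_h f: each a_k is multiplied by lambda_k = e^{ikh}."""
    return {k: a * cmath.exp(1j * k * h) for k, a in coeffs.items()}

# Hypothetical example data.
coeffs = {-2: 0.5 - 1j, 0: 2.0, 1: 1j, 3: -0.25}
h = 0.7
shifted = translate_coeffs(coeffs, h)

# Compare the multiplier description with the direct translate f(x + h).
sample_points = [j * 0.1 for j in range(-30, 31)]
max_err = max(abs(trig_poly(shifted, x) - trig_poly(coeffs, x + h))
              for x in sample_points)

# Since |lambda_k| = 1, the l^2 norm of the coefficients is preserved.
norm_sq = sum(abs(a) ** 2 for a in coeffs.values())
norm_sq_shifted = sum(abs(a) ** 2 for a in shifted.values())
print(max_err, norm_sq, norm_sq_shifted)
```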


Integral operators, and in particular, Hilbert-Schmidt
operators

Let $\mathcal{H} = L^2(\mathbb{R}^d)$. If we can define an operator $T : \mathcal{H} \to \mathcal{H}$ by the formula

$$T(f)(x) = \int_{\mathbb{R}^d} K(x,y) f(y)\,dy \quad \text{whenever } f \in L^2(\mathbb{R}^d),$$

we say that the operator $T$ is an integral operator and $K$ is its
associated kernel.

In fact, it was the problem of invertibility related to such operators,
and more precisely the question of solvability of the equation $f - Tf = g$
for given $g$, that initiated the study of Hilbert spaces. These equations
were then called “integral equations.”

In general a bounded linear transformation cannot be expressed as an
(absolutely convergent) integral operator. However, there is an interesting
class for which this is possible and which has a number of other
worthwhile properties: Hilbert-Schmidt operators, those with a kernel
$K$ that belongs to $L^2(\mathbb{R}^d \times \mathbb{R}^d)$.

Proposition 5.5 Let $T$ be a Hilbert-Schmidt operator on $L^2(\mathbb{R}^d)$ with
kernel $K$.

(i) If $f \in L^2(\mathbb{R}^d)$, then for almost every $x$ the function
$y \mapsto K(x,y)f(y)$ is integrable.

(ii) The operator $T$ is bounded from $L^2(\mathbb{R}^d)$ to itself, and

$$\|T\| \le \|K\|_{L^2(\mathbb{R}^d \times \mathbb{R}^d)},$$

where $\|K\|_{L^2(\mathbb{R}^d \times \mathbb{R}^d)}$ is the $L^2$-norm of $K$ on
$\mathbb{R}^d \times \mathbb{R}^d = \mathbb{R}^{2d}$.

(iii) The adjoint $T^*$ has kernel $\overline{K(y,x)}$.

Proof. By Fubini’s theorem we know that for almost every $x$, the
function $y \mapsto |K(x,y)|^2$ is integrable. Then, part (i) follows directly from
an application of the Cauchy-Schwarz inequality.

For (ii), we make use again of the Cauchy-Schwarz inequality as follows:

$$\left|\int K(x,y)f(y)\,dy\right| \le \int |K(x,y)|\,|f(y)|\,dy
  \le \left(\int |K(x,y)|^2\,dy\right)^{1/2} \left(\int |f(y)|^2\,dy\right)^{1/2}.$$

Therefore, squaring this and integrating in $x$ yields

$$\|Tf\|^2_{L^2(\mathbb{R}^d)} \le \int \left(\int |K(x,y)|^2\,dy \int |f(y)|^2\,dy\right) dx
  = \|K\|^2_{L^2(\mathbb{R}^d \times \mathbb{R}^d)}\,\|f\|^2_{L^2(\mathbb{R}^d)}.$$

Finally, part (iii) follows by writing out $(Tf,g)$ in terms of a double
integral, and then interchanging the order of integration, as is permissible
by Fubini’s theorem.

Hilbert-Schmidt operators can be defined analogously for the Hilbert
space $L^2(E)$, where $E$ is a measurable subset of $\mathbb{R}^d$. We leave it to the
reader to formulate and prove the analogue of Proposition 5.5 that holds
in this case.
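
The bound $\|T\| \le \|K\|_{L^2}$ can be illustrated on a discretized kernel. The sketch below is an assumption-laden illustration, not part of the text: it takes the hypothetical kernel $K(x,y) = \min(x,y)$ on $[0,1]^2$, samples it at midpoints of a uniform grid, and compares a power-iteration estimate of the operator norm of the resulting matrix with its Frobenius norm, which plays the role of the Hilbert-Schmidt norm.

```python
import math

def kernel(x, y):
    """Illustrative kernel on [0, 1] x [0, 1] (an assumption, not from the text)."""
    return min(x, y)

def discretize(n):
    """Midpoint-rule matrix A[i][j] = K(x_i, y_j) / n approximating the operator."""
    pts = [(i + 0.5) / n for i in range(n)]
    return [[kernel(x, y) / n for y in pts] for x in pts]

def frobenius_norm(a):
    """Square root of the sum of squared entries (discrete L^2 norm of the kernel)."""
    return math.sqrt(sum(v * v for row in a for v in row))

def operator_norm(a, iterations=50):
    """Estimate the largest singular value by power iteration on A^T A."""
    n = len(a)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]   # w = A v
        u = [sum(a[i][j] * w[i] for i in range(n)) for j in range(n)]   # u = A^T w
        norm_u = math.sqrt(sum(t * t for t in u))
        v = [t / norm_u for t in u]
    w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(t * t for t in w))

a = discretize(40)
hs = frobenius_norm(a)
op = operator_norm(a)
print(op, hs)   # the operator norm estimate does not exceed the Frobenius norm
```

For this particular kernel the two numbers are close, because one eigenvalue dominates; for kernels with many eigenvalues of comparable size the gap would be larger.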

Hilbert-Schmidt operators enjoy another important property: they are 
compact. We will now discuss this feature in more detail. 

6 Compact operators 

We shall use the notion of sequential compactness in a Hilbert space $\mathcal{H}$:
a set $X \subset \mathcal{H}$ is compact if for every sequence $\{f_n\}$ in $X$, there exists a
subsequence $\{f_{n_k}\}$ that converges in the norm to an element in $X$.

Let $\mathcal{H}$ denote a Hilbert space, and $B$ the closed unit ball in $\mathcal{H}$,

$$B = \{f \in \mathcal{H} : \|f\| \le 1\}.$$

A well-known result in elementary real analysis says that in a
finite-dimensional Euclidean space, a closed and bounded set is compact.
However, this does not carry over to the infinite-dimensional case. The fact
is that in this case the unit ball, while closed and bounded, is not
compact. To see this, consider the sequence $\{f_n\} = \{e_n\}$, where the $e_n$ are
orthonormal. By the Pythagorean theorem, $\|e_n - e_m\|^2 = 2$ if $n \ne m$, so
no subsequence of the $\{e_n\}$ can converge.

In the infinite-dimensional case we say that a linear operator
$T : \mathcal{H} \to \mathcal{H}$ is compact if the closure of

$$T(B) = \{g \in \mathcal{H} : g = T(f) \text{ for some } f \in B\}$$

is a compact set. Equivalently, an operator $T$ is compact if, whenever
$\{f_k\}$ is a bounded sequence in $\mathcal{H}$, there exists a subsequence $\{f_{n_k}\}$ so
that $Tf_{n_k}$ converges. Note that a compact operator is automatically
bounded.

Note that by what has been said, a linear transformation is in general 
not compact (take for instance the identity operator!). However, if T is 
of finite rank, which means that its range is finite-dimensional, then 
it is automatically compact. It turns out that dealing with compact 
operators provides us with the closest analogy to the usual theorems of 
(finite-dimensional) linear algebra. Some relevant analytic properties of 
compact operators are given by the proposition below. 

Proposition 6.1 Suppose $T$ is a bounded linear operator on $\mathcal{H}$.

(i) If $S$ is compact on $\mathcal{H}$, then $ST$ and $TS$ are also compact.

(ii) If $\{T_n\}$ is a family of compact linear operators with $\|T_n - T\| \to 0$
as $n$ tends to infinity, then $T$ is compact.

(iii) Conversely, if $T$ is compact, there is a sequence $\{T_n\}$ of operators
of finite rank such that $\|T_n - T\| \to 0$.

(iv) $T$ is compact if and only if $T^*$ is compact.

Proof. Part (i) is immediate. For part (ii) we use a diagonalization
argument. Suppose $\{f_k\}$ is a bounded sequence in $\mathcal{H}$. Since $T_1$ is
compact, we may extract a subsequence $\{f_{1,k}\}_{k=1}^\infty$ of $\{f_k\}$ such that
$T_1(f_{1,k})$ converges. From $\{f_{1,k}\}$ we may find a subsequence
$\{f_{2,k}\}_{k=1}^\infty$ such that $T_2(f_{2,k})$ converges, and so on. If we let
$g_k = f_{k,k}$, then we claim $\{T(g_k)\}$ is a Cauchy sequence. We have

$$\|T(g_k) - T(g_\ell)\| \le \|T(g_k) - T_m(g_k)\| + \|T_m(g_k) - T_m(g_\ell)\|
  + \|T_m(g_\ell) - T(g_\ell)\|.$$

Since $\|T - T_m\| \to 0$ and $\{g_k\}$ is bounded, we can make the first and
last term each $< \epsilon/3$ for some large $m$ independent of $k$ and $\ell$. With this
fixed $m$, we note that by construction $\|T_m(g_k) - T_m(g_\ell)\| < \epsilon/3$ for all
large $k$ and $\ell$. This proves our claim; hence $\{T(g_k)\}$ converges in $\mathcal{H}$.

To prove (iii) let $\{e_k\}_{k=1}^\infty$ be a basis of $\mathcal{H}$ and let $Q_n$ be the
orthogonal projection on the subspace spanned by the $e_k$ with $k > n$. Then clearly
$Q_n(g) \sim \sum_{k>n} a_k e_k$ whenever $g \sim \sum_{k=1}^\infty a_k e_k$, and $\|Q_n g\|^2$ is a
decreasing sequence that tends to $0$ as $n \to \infty$ for any $g \in \mathcal{H}$. We claim that
$\|Q_n T\| \to 0$ as $n \to \infty$. If not, then since $\|Q_n T\|$ is non-increasing in $n$,
there is a $c > 0$ so that $\|Q_n T\| \ge c$ for all $n$, and
hence for each $n$ we can find $f_n$, with $\|f_n\| = 1$, so that $\|Q_n T f_n\| \ge c/2$.
Now by compactness of $T$, choosing an appropriate subsequence $\{f_{n_k}\}$,
we have $Tf_{n_k} \to g$ for some $g$. But $Q_{n_k}(g) = Q_{n_k} T f_{n_k} - Q_{n_k}(T f_{n_k} - g)$,
and hence we conclude that $\|Q_{n_k}(g)\| \ge c/4$ for large $k$. This
contradiction shows that $\|Q_n T\| \to 0$. So if $P_n$ is the complementary projection
on the finite-dimensional space spanned by $e_1, \dots, e_n$, so that $I = P_n + Q_n$, then
$\|Q_n T\| \to 0$ means that $\|P_n T - T\| \to 0$. Since each $P_n T$ is of finite rank,
assertion (iii) is established.

Finally, if $T$ is compact the fact that $\|P_n T - T\| \to 0$ implies
$\|T^* P_n - T^*\| \to 0$, and clearly $T^* P_n$ is again of finite rank. Thus we need only
appeal to the second conclusion to prove the last.

We now state two further observations about compact operators.

• If $T$ can be diagonalized with respect to some basis $\{\varphi_k\}$ of
eigenvectors and corresponding eigenvalues $\{\lambda_k\}$, then $T$ is compact if
and only if $|\lambda_k| \to 0$. See Exercise 25.

• Every Hilbert-Schmidt operator is compact.
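
The first observation can be made concrete with a finite truncation. For a diagonal operator, $T - P_nT$ acts by $\lambda_k$ on $\varphi_k$ for $k > n$ and by $0$ otherwise, so by the norm formula for diagonal operators its norm is $\sup_{k>n}|\lambda_k|$. The sketch below (names and sample sequences are hypothetical) compares the decaying sequence $\lambda_k = 1/k$ with the constant sequence $\lambda_k = 1$ of the identity operator.

```python
def tail_norm(multipliers, n):
    """Norm of T - P_n T for a diagonal operator with the given multipliers.

    multipliers[k - 1] holds lambda_k; the tail operator keeps only k > n,
    so its norm is the largest |lambda_k| with k > n (0 if the tail is empty).
    """
    tail = multipliers[n:]
    return max((abs(lam) for lam in tail), default=0.0)

N = 1000  # truncation level, an arbitrary choice for the illustration
decaying = [1.0 / k for k in range(1, N + 1)]   # lambda_k = 1/k, tends to 0
identity = [1.0] * N                            # lambda_k = 1, does not tend to 0

decaying_tails = [tail_norm(decaying, n) for n in (1, 10, 100)]
identity_tails = [tail_norm(identity, n) for n in (1, 10, 100)]
print(decaying_tails)   # shrinks as n grows: finite-rank approximation in norm
print(identity_tails)   # stays at 1: no finite-rank approximation of this kind
```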

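To prove the second point, recall that a Hilbert-Schmidt operator is
given on $L^2(\mathbb{R}^d)$ by

$$T(f)(x) = \int_{\mathbb{R}^d} K(x,y) f(y)\,dy, \quad \text{where } K \in L^2(\mathbb{R}^d \times \mathbb{R}^d).$$

If $\{\varphi_k\}_{k=1}^\infty$ denotes an orthonormal basis for $L^2(\mathbb{R}^d)$, then the collection
$\{\varphi_k(x)\varphi_\ell(y)\}_{k,\ell \ge 1}$ is an orthonormal basis for
$L^2(\mathbb{R}^d \times \mathbb{R}^d)$; the proof of this simple fact is outlined in Exercise 7.
As a result

$$K(x,y) \sim \sum_{k,\ell=1}^{\infty} a_{k\ell}\,\varphi_k(x)\varphi_\ell(y), \quad
  \text{with } \sum_{k,\ell} |a_{k\ell}|^2 < \infty.$$

We define an operator

$$T_n f(x) = \int_{\mathbb{R}^d} K_n(x,y) f(y)\,dy, \quad \text{where }
  K_n(x,y) = \sum_{k,\ell=1}^{n} a_{k\ell}\,\varphi_k(x)\varphi_\ell(y).$$

Then, each $T_n$ has finite-dimensional range, hence is compact. Moreover,

$$\|K - K_n\|^2_{L^2(\mathbb{R}^d \times \mathbb{R}^d)} = \sum_{k > n \text{ or } \ell > n} |a_{k\ell}|^2 \to 0
  \quad \text{as } n \to \infty.$$

By Proposition 5.5, $\|T - T_n\| \le \|K - K_n\|_{L^2(\mathbb{R}^d \times \mathbb{R}^d)}$, so we can conclude
the proof that $T$ is compact by appealing to Proposition 6.1.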

The climax of our efforts regarding compact operators is the infinite- 
dimensional version of the familiar diagonalization theorem in linear al- 
gebra for symmetric matrices. Using a similar terminology, we say that 
a bounded linear operator T is symmetric if T* = T. (These operators 
are also called “self-adjoint” or “Hermitian.” ) 

Theorem 6.2 (Spectral theorem) Suppose $T$ is a compact symmetric
operator on a Hilbert space $\mathcal{H}$. Then there exists an (orthonormal)
basis $\{\varphi_k\}_{k=1}^\infty$ of $\mathcal{H}$ that consists of eigenvectors of $T$. Moreover, if

$$T\varphi_k = \lambda_k \varphi_k,$$

then $\lambda_k \in \mathbb{R}$ and $\lambda_k \to 0$ as $k \to \infty$.

Conversely, every operator of the above form is compact and symmetric.
The collection $\{\lambda_k\}$ is called the spectrum of $T$.

Lemma 6.3 Suppose $T$ is a bounded symmetric linear operator on a
Hilbert space $\mathcal{H}$.

(i) If $\lambda$ is an eigenvalue of $T$, then $\lambda$ is real.

(ii) If $f_1$ and $f_2$ are eigenvectors corresponding to two distinct
eigenvalues, then $f_1$ and $f_2$ are orthogonal.

Proof. To prove (i), we first choose a non-zero eigenvector $f$ such
that $T(f) = \lambda f$. Since $T$ is symmetric (that is, $T = T^*$), we find that

$$\lambda(f,f) = (Tf,f) = (f,Tf) = (f,\lambda f) = \overline{\lambda}(f,f),$$

where we have used in the last equality the fact that the inner product is
conjugate linear in the second variable. Since $f \ne 0$, we must have
$\lambda = \overline{\lambda}$ and hence $\lambda \in \mathbb{R}$.

For (ii), suppose $f_1$ and $f_2$ have eigenvalues $\lambda_1$ and $\lambda_2$, respectively.
By the previous argument both $\lambda_1$ and $\lambda_2$ are real, and we note that

$$\lambda_1(f_1,f_2) = (\lambda_1 f_1, f_2) = (Tf_1,f_2) = (f_1,Tf_2) = (f_1,\lambda_2 f_2) = \lambda_2(f_1,f_2).$$

Since by assumption $\lambda_1 \ne \lambda_2$, we must have $(f_1,f_2) = 0$ as desired.

For the next lemma note that every non-zero element of the null-space
of $T - \lambda I$ is an eigenvector with eigenvalue $\lambda$.

Lemma 6.4 Suppose $T$ is compact, and $\lambda \ne 0$. Then the dimension of
the null space of $T - \lambda I$ is finite. Moreover, the eigenvalues of $T$ form
at most a denumerable set $\lambda_1, \dots, \lambda_k, \dots$, with $\lambda_k \to 0$ as $k \to \infty$. More
specifically, for each $\mu > 0$, the linear space spanned by the eigenvectors
corresponding to the eigenvalues $\lambda_k$ with $|\lambda_k| > \mu$ is finite-dimensional.

Proof. Let $V_\lambda$ denote the null-space of $T - \lambda I$, that is, the eigenspace
of $T$ corresponding to $\lambda$. If $V_\lambda$ is not finite-dimensional, there exists
a countable sequence of orthonormal vectors $\{\varphi_k\}$ in $V_\lambda$. Since $T$ is
compact, there exists a subsequence $\{\varphi_{n_k}\}$ such that $T(\varphi_{n_k})$ converges.
But since $T(\varphi_{n_k}) = \lambda\varphi_{n_k}$ and $\lambda \ne 0$, we conclude that $\varphi_{n_k}$ converges,
which is a contradiction since $\|\varphi_{n_k} - \varphi_{n_{k'}}\|^2 = 2$ if $k \ne k'$.

The rest of the lemma follows if we can show that for each $\mu > 0$, there
are only finitely many eigenvalues whose absolute values are greater than
$\mu$. We argue again by contradiction. Suppose there are infinitely many
distinct eigenvalues whose absolute values are greater than $\mu$, and let
$\{\varphi_k\}$ be a corresponding sequence of eigenvectors. Since the eigenvalues
are distinct, we know from the previous lemma that $\{\varphi_k\}$ is orthogonal,
and after normalization, we may assume that this set of eigenvectors is
orthonormal. Once again, since $T$ is compact, we may find a subsequence
so that $T(\varphi_{n_k})$ converges, and since

$$T(\varphi_{n_k}) = \lambda_{n_k}\varphi_{n_k},$$

the fact that $|\lambda_{n_k}| > \mu$ leads to a contradiction, since $\{\varphi_k\}$ is an
orthonormal set and thus
$\|\lambda_{n_k}\varphi_{n_k} - \lambda_{n_j}\varphi_{n_j}\|^2 = \lambda_{n_k}^2 + \lambda_{n_j}^2 \ge 2\mu^2$.

Lemma 6.5 Suppose $T \ne 0$ is compact and symmetric. Then either $\|T\|$
or $-\|T\|$ is an eigenvalue of $T$.

Proof. By the observation (7) made earlier, either

$$\|T\| = \sup\{(Tf,f) : \|f\| = 1\} \quad \text{or} \quad -\|T\| = \inf\{(Tf,f) : \|f\| = 1\}.$$

We assume the first case, that is,

$$\lambda = \|T\| = \sup\{(Tf,f) : \|f\| = 1\},$$

and prove that $\lambda$ is an eigenvalue of $T$. (The proof of the other case is
similar.)

We pick a sequence $\{f_n\} \subset \mathcal{H}$ such that $\|f_n\| = 1$ and $(Tf_n,f_n) \to \lambda$.
Since $T$ is compact, we may assume also (by passing to a subsequence of
$\{f_n\}$ if necessary) that $\{Tf_n\}$ converges to a limit $g \in \mathcal{H}$. We claim that
$g$ is an eigenvector of $T$ with eigenvalue $\lambda$. To see this, we first observe
that $Tf_n - \lambda f_n \to 0$ because

$$\|Tf_n - \lambda f_n\|^2 = \|Tf_n\|^2 - 2\lambda(Tf_n,f_n) + \lambda^2\|f_n\|^2
  \le \|T\|^2\|f_n\|^2 - 2\lambda(Tf_n,f_n) + \lambda^2\|f_n\|^2
  \le 2\lambda^2 - 2\lambda(Tf_n,f_n) \to 0.$$

Since $Tf_n \to g$, we must have $\lambda f_n \to g$, and since $T$ is continuous, this
implies that $\lambda Tf_n \to Tg$. This proves that $\lambda g = Tg$. Finally, we must
have $g \ne 0$, for otherwise $\|Tf_n\| \to 0$, hence $(Tf_n,f_n) \to 0$, and $\lambda =
\|T\| = 0$, which is a contradiction.
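
In finite dimensions the lemma is easy to observe numerically. The following sketch is only an illustration under stated assumptions: the symmetric matrix is a hypothetical example, and the method (power iteration with $T^2$, followed by a Rayleigh quotient) is a standard numerical device, not the argument of the proof. It assumes that only one of $\pm\|T\|$ is an eigenvalue, so that the iteration converges to a single eigenvector.

```python
import math

def apply_matrix(matrix, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

def dot(u, v):
    """Real inner product."""
    return sum(a * b for a, b in zip(u, v))

def extremal_eigenpair(matrix, iterations=200):
    """Approximate the eigenvalue of largest absolute value of a real
    symmetric matrix, together with a unit eigenvector."""
    v = [1.0] + [0.5] * (len(matrix) - 1)          # arbitrary starting vector
    for _ in range(iterations):
        w = apply_matrix(matrix, apply_matrix(matrix, v))   # apply T^2
        norm_w = math.sqrt(dot(w, w))
        v = [x / norm_w for x in w]
    rayleigh = dot(apply_matrix(matrix, v), v)     # (Tv, v) with ||v|| = 1
    return rayleigh, v

# Hypothetical example: eigenvalues are -1 and -3, so ||T|| = 3
# and it is -||T|| that occurs as an eigenvalue.
T = [[-2.0, 1.0], [1.0, -2.0]]
lam, v = extremal_eigenpair(T)
Tv = apply_matrix(T, v)
residual = math.sqrt(sum((a - lam * b) ** 2 for a, b in zip(Tv, v)))
print(lam, residual)
```

The Rayleigh quotient $(Tv,v)$ recovers the sign, which is why the iteration is run with $T^2$ rather than $T$.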

We are now equipped with the necessary tools to prove the spectral
theorem. Let $S$ denote the closure of the linear space spanned by all
eigenvectors of $T$. By Lemma 6.5, the space $S$ is non-empty. The goal
is to prove that $S = \mathcal{H}$. If not, then since

(9)    $$S \oplus S^\perp = \mathcal{H},$$

$S^\perp$ would be non-empty (that is, it would contain a non-zero vector).
We will have reached a contradiction once
we show that $S^\perp$ contains an eigenvector of $T$. First, we note that $T$
respects the decomposition (9). In other words, if $f \in S$ then $Tf \in S$,
which follows from the definitions. Also, if $g \in S^\perp$ then $Tg \in S^\perp$. This
is because $T$ is symmetric and maps $S$ to itself, and hence

$$(Tg,f) = (g,Tf) = 0 \quad \text{whenever } g \in S^\perp \text{ and } f \in S.$$

Now consider the operator $T_1$, which by definition is the restriction of
$T$ to the subspace $S^\perp$. The closed subspace $S^\perp$ inherits its Hilbert space
structure from $\mathcal{H}$. We see immediately that $T_1$ is also a compact and
symmetric operator on this Hilbert space. Moreover, if $S^\perp$ is non-empty,
then $T_1$ has a non-zero eigenvector in $S^\perp$: if $T_1 = 0$, every non-zero
vector of $S^\perp$ is an eigenvector with eigenvalue $0$, and otherwise
Lemma 6.5 applies. This eigenvector
is clearly also an eigenvector of $T$, and therefore a contradiction
is obtained. This concludes the proof of the spectral theorem.

Some comments about Theorem 6.2 are in order. If in its statement we 
drop either of the two assumptions (the compactness or symmetry of T), 
then T may have no eigenvectors. (See Exercises 32 and 33.) However, 
when T is a general bounded linear transformation which is symmetric, 
there is an appropriate extension of the spectral theorem that holds for 
it. Its formulation and proof require further ideas that are deferred to 
Chapter 6. 

7 Exercises 

1. Show that properties (i) and (ii) in the definition of a Hilbert space (Section 2)
imply property (iii): the Cauchy-Schwarz inequality $|(f,g)| \le \|f\| \cdot \|g\|$ and the
triangle inequality $\|f + g\| \le \|f\| + \|g\|$.

[Hint: For the first inequality, consider $(f + \lambda g, f + \lambda g)$ as a positive quadratic
function of $\lambda$. For the second, write $\|f + g\|^2$ as $(f + g, f + g)$.]

2. In the case of equality in the Cauchy-Schwarz inequality we have the following.
If $|(f,g)| = \|f\|\,\|g\|$ and $g \ne 0$, then $f = cg$ for some scalar $c$.

[Hint: Assume $\|f\| = \|g\| = 1$ and $(f,g) = 1$. Then $f - g$ and $g$ are orthogonal,
while $f = f - g + g$. Thus $\|f\|^2 = \|f - g\|^2 + \|g\|^2$.]

3. Note that $\|f + g\|^2 = \|f\|^2 + \|g\|^2 + 2\,\mathrm{Re}\,(f,g)$ for any pair of elements in a
Hilbert space $\mathcal{H}$. As a result, verify the identity
$\|f + g\|^2 + \|f - g\|^2 = 2(\|f\|^2 + \|g\|^2)$.


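4. Prove from the definition that $\ell^2(\mathbb{Z})$ is complete and separable.

5. Establish the following relations between $L^2(\mathbb{R}^d)$ and $L^1(\mathbb{R}^d)$:

(a) Neither the inclusion $L^2(\mathbb{R}^d) \subset L^1(\mathbb{R}^d)$ nor the inclusion
$L^1(\mathbb{R}^d) \subset L^2(\mathbb{R}^d)$ is valid.

(b) Note, however, that if $f$ is supported on a set $E$ of finite measure and if
$f \in L^2(\mathbb{R}^d)$, applying the Cauchy-Schwarz inequality to $f\chi_E$ gives
$f \in L^1(\mathbb{R}^d)$, and

$$\|f\|_{L^1(\mathbb{R}^d)} \le m(E)^{1/2}\,\|f\|_{L^2(\mathbb{R}^d)}.$$

(c) If $f$ is bounded ($|f(x)| \le M$), and $f \in L^1(\mathbb{R}^d)$, then $f \in L^2(\mathbb{R}^d)$ with

$$\|f\|_{L^2(\mathbb{R}^d)} \le M^{1/2}\,\|f\|^{1/2}_{L^1(\mathbb{R}^d)}.$$

[Hint: For (a) consider $f(x) = |x|^{-\alpha}$, when $|x| \le 1$ or when $|x| > 1$.]

6. Prove that the following are dense subspaces of $L^2(\mathbb{R}^d)$.

(a) The simple functions.

(b) The continuous functions of compact support.


7. Suppose $\{\varphi_k\}_{k=1}^\infty$ is an orthonormal basis for $L^2(\mathbb{R}^d)$. Prove that the collection
$\{\varphi_{k,j}\}_{1 \le k,j < \infty}$ with $\varphi_{k,j}(x,y) = \varphi_k(x)\varphi_j(y)$ is an orthonormal basis of
$L^2(\mathbb{R}^d \times \mathbb{R}^d)$.

[Hint: First verify that the $\{\varphi_{k,j}\}$ are orthonormal, by Fubini’s theorem. Next,
for each $j$ consider $F_j(x) = \int_{\mathbb{R}^d} F(x,y)\overline{\varphi_j(y)}\,dy$. If one assumes that
$(F,\varphi_{k,j}) = 0$ for all $j$, then $\int F_j(x)\overline{\varphi_k(x)}\,dx = 0$.]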


8. Let $\eta(t)$ be a fixed strictly positive continuous function on $[a,b]$. Define
$\mathcal{H}_\eta = L^2([a,b],\eta)$ to be the space of all measurable functions $f$ on $[a,b]$ such that

$$\int_a^b |f(t)|^2\,\eta(t)\,dt < \infty.$$

Define the inner product on $\mathcal{H}_\eta$ by

$$(f,g)_\eta = \int_a^b f(t)\overline{g(t)}\,\eta(t)\,dt.$$

(a) Show that $\mathcal{H}_\eta$ is a Hilbert space, and that the mapping $U : f \mapsto \eta^{1/2} f$ gives
a unitary correspondence between $\mathcal{H}_\eta$ and the usual space $L^2([a,b])$.

(b) Generalize this to the case when $\eta$ is not necessarily continuous.


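9. Let $\mathcal{H}_1 = L^2([-\pi,\pi])$ be the Hilbert space of functions $F(e^{i\theta})$ on the unit circle
with inner product $(F,G) = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{i\theta})\overline{G(e^{i\theta})}\,d\theta$.
Let $\mathcal{H}_2$ be the space $L^2(\mathbb{R})$. Using the mapping

$$x \mapsto \frac{i - x}{i + x}$$

of $\mathbb{R}$ to the unit circle, show that:

(a) The correspondence $U : F \to f$, with

$$f(x) = \frac{1}{\pi^{1/2}(i + x)}\,F\!\left(\frac{i - x}{i + x}\right),$$

gives a unitary mapping of $\mathcal{H}_1$ to $\mathcal{H}_2$.

(b) As a result,

$$\left\{ \frac{1}{\pi^{1/2}} \left(\frac{i - x}{i + x}\right)^{n} \frac{1}{i + x} \right\}_{n=-\infty}^{\infty}$$

is an orthonormal basis of $L^2(\mathbb{R})$.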


10. Let $S$ denote a subspace of a Hilbert space $\mathcal{H}$. Prove that $(S^\perp)^\perp$ is the
smallest closed subspace of $\mathcal{H}$ that contains $S$.

11. Let $P$ be the orthogonal projection associated with a closed subspace $S$ in a
Hilbert space $\mathcal{H}$, that is,

$$P(f) = f \text{ if } f \in S \quad \text{and} \quad P(f) = 0 \text{ if } f \in S^\perp.$$

(a) Show that $P^2 = P$ and $P^* = P$.

(b) Conversely, if $P$ is any bounded operator satisfying $P^2 = P$ and $P^* = P$,
prove that $P$ is the orthogonal projection for some closed subspace of $\mathcal{H}$.

(c) Using $P$, prove that if $S$ is a closed subspace of a separable Hilbert space,
then $S$ is also a separable Hilbert space.


12. Let $E$ be a measurable subset of $\mathbb{R}^d$, and suppose $S$ is the subspace of $L^2(\mathbb{R}^d)$
of functions that vanish for a.e. $x \notin E$. Show that the orthogonal projection $P$ on
$S$ is given by $P(f) = \chi_E \cdot f$, where $\chi_E$ is the characteristic function of $E$.

13. Suppose $P_1$ and $P_2$ are a pair of orthogonal projections on $S_1$ and $S_2$,
respectively. Then $P_1P_2$ is an orthogonal projection if and only if $P_1$ and $P_2$ commute,
that is, $P_1P_2 = P_2P_1$. In this case, $P_1P_2$ projects onto $S_1 \cap S_2$.

14. Suppose $\mathcal{H}$ and $\mathcal{H}'$ are two completions of a pre-Hilbert space $\mathcal{H}_0$. Show that
there is a unitary mapping from $\mathcal{H}$ to $\mathcal{H}'$ that is the identity on $\mathcal{H}_0$.

[Hint: If $f \in \mathcal{H}$, pick a Cauchy sequence $\{f_n\}$ in $\mathcal{H}_0$ that converges to $f$ in $\mathcal{H}$. This
sequence will also converge to an element $f'$ in $\mathcal{H}'$. The mapping $f \mapsto f'$ gives the
required unitary mapping.]

15. Let $T$ be any linear transformation from $\mathcal{H}_1$ to $\mathcal{H}_2$. If we suppose that $\mathcal{H}_1$ is
finite-dimensional, then $T$ is automatically bounded. (If $\mathcal{H}_1$ is not assumed to be
finite-dimensional this may fail; see Problem 1 below.)

16. Let $F_0(z) = 1/(1 - z)^i$.

(a) Verify that $|F_0(z)| \le e^{\pi/2}$ in the unit disc, but that $\lim_{r \to 1} F_0(r)$ does not
exist.

[Hint: Note that $|F_0(r)| = 1$ and $F_0(r)$ oscillates between $\pm 1$ infinitely often
as $r \to 1$.]

(b) Let $\{\alpha_n\}_{n=1}^\infty$ be an enumeration of the rationals, and let

$$F(z) = \sum_{j=1}^{\infty} \delta^j F_0(z e^{-i\alpha_j}),$$

where $\delta$ is sufficiently small. Show that $\lim_{r \to 1} F(re^{i\theta})$ fails to exist
whenever $\theta = \alpha_j$, and hence $F$ fails to have a radial limit for a dense set of points
on the unit circle.


17. Fatou’s theorem can be generalized by allowing a point to approach the
boundary in larger regions, as follows.

For each $0 < s < 1$ and point $z$ on the unit circle, consider the region $\Gamma_s(z)$
defined as the smallest closed convex set that contains $z$ and the closed disc $D_s(0)$.
In other words, $\Gamma_s(z)$ consists of all lines joining $z$ with points in $D_s(0)$. Near the
point $z$, the region $\Gamma_s(z)$ looks like a triangle. See Figure 2.

We say that a function $F$ defined in the open unit disc has a non-tangential
limit at a point $z$ on the circle, if for every $0 < s < 1$, the limit

$$\lim_{\substack{w \to z \\ w \in \Gamma_s(z)}} F(w)$$

exists.

Prove that if $F$ is holomorphic and bounded on the open unit disc, then $F$ has
a non-tangential limit for almost every point on the unit circle.

[Hint: Show that the Poisson integral of a function $f$ has non-tangential limits at
every point of the Lebesgue set of $f$.]

18. Let $\mathcal{H}$ denote a Hilbert space, and $\mathcal{L}(\mathcal{H})$ the vector space of all bounded linear
operators on $\mathcal{H}$. Given $T \in \mathcal{L}(\mathcal{H})$, we define the operator norm

$$\|T\| = \inf\{B : \|Tv\| \le B\|v\|, \text{ for all } v \in \mathcal{H}\}.$$

(a) Show that $\|T_1 + T_2\| \le \|T_1\| + \|T_2\|$ whenever $T_1, T_2 \in \mathcal{L}(\mathcal{H})$.

(b) Prove that

$$d(T_1,T_2) = \|T_1 - T_2\|$$

defines a metric on $\mathcal{L}(\mathcal{H})$.

(c) Show that $\mathcal{L}(\mathcal{H})$ is complete in the metric $d$.

19. If $T$ is a bounded linear operator on a Hilbert space, prove that

$$\|TT^*\| = \|T^*T\| = \|T\|^2 = \|T^*\|^2.$$


20. Suppose $\mathcal{H}$ is an infinite-dimensional Hilbert space. We have seen an example
of a sequence $\{f_n\}$ in $\mathcal{H}$ with $\|f_n\| = 1$ for all $n$, but for which no subsequence
of $\{f_n\}$ converges in $\mathcal{H}$. However, show that for any sequence $\{f_n\}$ in $\mathcal{H}$ with
$\|f_n\| = 1$ for all $n$, there exist $f \in \mathcal{H}$ and a subsequence $\{f_{n_k}\}$ such that for all
$g \in \mathcal{H}$, one has

$$\lim_{k \to \infty} (f_{n_k}, g) = (f,g).$$

One says that $\{f_{n_k}\}$ converges weakly to $f$.

[Hint: Let $g$ run through a basis for $\mathcal{H}$, and use a diagonalization argument. One
can then define $f$ by giving its series expansion with respect to the chosen basis.]

21. There are several senses in which a sequence of bounded operators $\{T_n\}$ can
converge to a bounded operator $T$ (in a Hilbert space $\mathcal{H}$). First, there is
convergence in the norm, that is, $\|T_n - T\| \to 0$, as $n \to \infty$. Next, there is a weaker
convergence, which happens to be called strong convergence, that requires that
$T_n f \to Tf$, as $n \to \infty$, for every vector $f \in \mathcal{H}$. Finally, there is weak
convergence (see also Exercise 20) that requires $(T_n f, g) \to (Tf, g)$ for every pair of
vectors $f, g \in \mathcal{H}$.

(a) Show by examples that weak convergence does not imply strong convergence,
nor does strong convergence imply convergence in the norm.

(b) Show that for any bounded operator $T$ there is a sequence $\{T_n\}$ of bounded
operators of finite rank so that $T_n \to T$ strongly as $n \to \infty$.


22. An operator $T$ is an isometry if $\|Tf\| = \|f\|$ for all $f \in \mathcal{H}$.

(a) Show that if $T$ is an isometry, then $(Tf,Tg) = (f,g)$ for every $f, g \in \mathcal{H}$.
Prove as a result that $T^*T = I$.

(b) If $T$ is an isometry and $T$ is surjective, then $T$ is unitary and $TT^* = I$.

(c) Give an example of an isometry that is not unitary.

(d) Show that if $T^*T$ is unitary then $T$ is an isometry.

[Hint: Use the fact that $(Tf,Tf) = (f,f)$ for $f$ replaced by $f \pm g$ and $f \pm ig$.]

23. Suppose $\{T_k\}$ is a collection of bounded operators on a Hilbert space $\mathcal{H}$, with
$\|T_k\| \le 1$ for all $k$. Suppose also that

$$T_k T_j^* = T_k^* T_j = 0 \quad \text{for all } k \ne j.$$

Let $S_N = \sum_{k=-N}^{N} T_k$.

Show that $S_N(f)$ converges as $N \to \infty$, for every $f \in \mathcal{H}$. If $T(f)$ denotes the
limit, prove that $\|T\| \le 1$.

A generalization is given in Problem 8* below.

[Hint: Consider first the case when only finitely many of the $T_k$ are non-zero, and
note that the ranges of the $T_k$ are mutually orthogonal.]

24. Let $\{e_k\}_{k=1}^\infty$ denote an orthonormal set in a Hilbert space $\mathcal{H}$. If $\{c_k\}_{k=1}^\infty$ is a
sequence of positive real numbers such that $\sum c_k^2 < \infty$, then the set

$$A = \Big\{ \sum_{k=1}^{\infty} a_k e_k : |a_k| \le c_k \Big\}$$

is compact in $\mathcal{H}$.

25. Suppose $T$ is a bounded operator that is diagonal with respect to a basis $\{\varphi_k\}$,
with $T\varphi_k = \lambda_k\varphi_k$. Then $T$ is compact if and only if $\lambda_k \to 0$.

[Hint: If $\lambda_k \to 0$, then note that $\|P_n T - T\| \to 0$, where $P_n$ is the orthogonal
projection on the subspace spanned by $\varphi_1, \varphi_2, \dots, \varphi_n$.]


26. Suppose $w$ is a measurable function on $\mathbb{R}^d$ with $0 < w(x) < \infty$ for a.e. $x$, and
$K$ is a measurable function on $\mathbb{R}^{2d}$ that satisfies:

(i) $\int_{\mathbb{R}^d} |K(x,y)|\,w(y)\,dy \le A\,w(x)$ for almost every $x \in \mathbb{R}^d$, and

(ii) $\int_{\mathbb{R}^d} |K(x,y)|\,w(x)\,dx \le A\,w(y)$ for almost every $y \in \mathbb{R}^d$.

Prove that the integral operator defined by

$$Tf(x) = \int_{\mathbb{R}^d} K(x,y) f(y)\,dy, \quad x \in \mathbb{R}^d,$$

is bounded on $L^2(\mathbb{R}^d)$ with $\|T\| \le A$.

Note as a special case that if $\int |K(x,y)|\,dy \le A$ for all $x$, and $\int |K(x,y)|\,dx \le A$
for all $y$, then $\|T\| \le A$.

[Hint: Show that if $f \in L^2(\mathbb{R}^d)$, then

$$\int |K(x,y)|\,|f(y)|\,dy \le A^{1/2} w(x)^{1/2}
  \left[ \int |K(x,y)|\,|f(y)|^2\,w(y)^{-1}\,dy \right]^{1/2}.]$$


27. Prove that the operator

$$Tf(x) = \frac{1}{\pi} \int_0^\infty \frac{f(y)}{x + y}\,dy$$

is bounded on $L^2(0,\infty)$ with norm $\|T\| \le 1$.
[Hint: Use Exercise 26 with an appropriate $w$.]


28. Suppose $\mathcal{H} = L^2(B)$, where $B$ is the unit ball in $\mathbb{R}^d$. Let $K(x,y)$ be a
measurable function on $B \times B$ that satisfies $|K(x,y)| \le A|x - y|^{-d+\alpha}$ for some $\alpha > 0$,
whenever $x, y \in B$. Define

$$Tf(x) = \int_B K(x,y) f(y)\,dy.$$

(a) Prove that $T$ is a bounded operator on $\mathcal{H}$.

(b) Prove that $T$ is compact.

(c) Note that $T$ is a Hilbert-Schmidt operator if and only if $\alpha > d/2$.

[Hint: For (b), consider the operators $T_n$ associated with the truncated kernels
$K_n(x,y) = K(x,y)$ if $|x - y| \ge 1/n$ and $0$ otherwise. Show that each $T_n$ is compact,
and that $\|T_n - T\| \to 0$ as $n \to \infty$.]

29. Let $T$ be a compact operator on a Hilbert space $\mathcal{H}$, and assume $\lambda \ne 0$.

(a) Show that the range of $\lambda I - T$ defined by

$$\{g \in \mathcal{H} : g = (\lambda I - T)f, \text{ for some } f \in \mathcal{H}\}$$

is closed. [Hint: Suppose $g_j \to g$, where $g_j = (\lambda I - T)f_j$. Let $V_\lambda$ denote
the eigenspace of $T$ corresponding to $\lambda$, that is, the kernel of $\lambda I - T$. Why
can one assume that $f_j \in V_\lambda^\perp$? Under this assumption prove that $\{f_j\}$ is a
bounded sequence.]

(b) Show by example that this may fail when $\lambda = 0$.

(c) Show that the range of $\lambda I - T$ is all of $\mathcal{H}$ if and only if the null-space of
$\overline{\lambda} I - T^*$ is trivial.


30. Let $\mathcal{H} = L^2([-\pi,\pi])$ with $[-\pi,\pi]$ identified as the unit circle. Fix a bounded
sequence $\{\lambda_n\}_{n=-\infty}^{\infty}$ of complex numbers, and define an operator $Tf$ by

$$Tf(x) \sim \sum_{n=-\infty}^{\infty} \lambda_n a_n e^{inx} \quad \text{whenever} \quad
  f(x) \sim \sum_{n=-\infty}^{\infty} a_n e^{inx}.$$

Such an operator is called a Fourier multiplier operator, and the sequence
$\{\lambda_n\}$ is called the multiplier sequence.

(a) Show that $T$ is a bounded operator on $\mathcal{H}$ and $\|T\| = \sup_n |\lambda_n|$.

(b) Verify that $T$ commutes with translations, that is, if we define $\tau_h f(x) =
f(x - h)$ then

$$T \circ \tau_h = \tau_h \circ T \quad \text{for every } h \in \mathbb{R}.$$

(c) Conversely, prove that if $T$ is any bounded operator on $\mathcal{H}$ that commutes
with translations, then $T$ is a Fourier multiplier operator. [Hint: Consider
$T(e^{inx})$.]


31. Consider a version of the sawtooth function defined on $[-\pi,\pi)$ by

$$K(x) = i(\mathrm{sgn}(x)\pi - x),$$

and extended to $\mathbb{R}$ with period $2\pi$. Suppose $f \in L^1([-\pi,\pi])$ is extended to $\mathbb{R}$ with
period $2\pi$, and define

$$Tf(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} K(x - y) f(y)\,dy
       = \frac{1}{2\pi}\int_{-\pi}^{\pi} K(y) f(x - y)\,dy.$$

Footnote: The symbol $\mathrm{sgn}(x)$ denotes the sign function: it equals $1$ or $-1$ if $x$ is
positive or negative respectively, and $0$ if $x = 0$.

(a) Show that $F(x) = Tf(x)$ is absolutely continuous, and if $\int_{-\pi}^{\pi} f(y)\,dy = 0$,
then $F'(x) = if(x)$ a.e. $x$.

(b) Show that the mapping $f \mapsto Tf$ is compact and symmetric on $L^2([-\pi,\pi])$.

(c) Prove that $\varphi(x) \in L^2([-\pi,\pi])$ is an eigenfunction for $T$ if and only if $\varphi(x)$
is (up to a constant multiple) equal to $e^{inx}$ for some integer $n \ne 0$ with
eigenvalue $1/n$, or $\varphi(x) = 1$ with eigenvalue $0$.

(d) Show as a result that $\{e^{inx}\}_{n \in \mathbb{Z}}$ is an orthonormal basis of $L^2([-\pi,\pi])$.

Note that in Book I, Chapter 2, Exercise 8, it is shown that the Fourier series
of $K$ is

$$K(x) \sim \sum_{n \ne 0} \frac{e^{inx}}{n}.$$

32. Consider the operator $T : L^2([0,1]) \to L^2([0,1])$ defined by

$$T(f)(t) = t f(t).$$

(a) Prove that $T$ is a bounded linear operator with $T = T^*$, but that $T$ is not
compact.

(b) However, show that $T$ has no eigenvectors.


33. Let $\mathcal{H}$ be a Hilbert space with basis $\{\varphi_k\}_{k=1}^\infty$. Verify that the operator $T$
defined by

$$T(\varphi_k) = \frac{1}{k}\,\varphi_{k+1}$$

is compact, but has no eigenvectors.

34. Let $K$ be a Hilbert-Schmidt kernel which is real and symmetric. Then, as we
saw, the operator $T$ whose kernel is $K$ is compact and symmetric. Let $\{\varphi_k(x)\}$ be
the eigenvectors (with eigenvalues $\lambda_k$) that diagonalize $T$. Then:

(a) $\sum_k |\lambda_k|^2 < \infty$.

(b) $K(x,y) \sim \sum \lambda_k \varphi_k(x)\varphi_k(y)$ is the expansion of $K$ in the basis $\{\varphi_k(x)\varphi_j(y)\}$.

(c) Suppose $T$ is a compact operator which is symmetric. Then $T$ is of
Hilbert-Schmidt type if and only if $\sum_n |\lambda_n|^2 < \infty$, where $\{\lambda_n\}$ are the eigenvalues
of $T$ counted according to their multiplicities.


35. Let $\mathcal{H}$ be a Hilbert space. Prove the following variants of the spectral theorem.




(a) If $T_1$ and $T_2$ are two linear symmetric and compact operators on $\mathcal{H}$ that
commute (that is, $T_1T_2 = T_2T_1$), show that they can be diagonalized
simultaneously. In other words, there exists an orthonormal basis for $\mathcal{H}$ which
consists of eigenvectors for both $T_1$ and $T_2$.

(b) A linear operator on $\mathcal{H}$ is normal if $TT^* = T^*T$. Prove that if $T$ is normal
and compact, then $T$ can be diagonalized.

[Hint: Write $T = T_1 + iT_2$ where $T_1$ and $T_2$ are symmetric, compact and
commute.]

(c) If $U$ is unitary, and $U = \lambda I - T$, where $T$ is compact, then $U$ can be
diagonalized.


8 Problems 

1. Let $\mathcal{H}$ be an infinite-dimensional Hilbert space. There exists a linear functional
$\ell$ defined on $\mathcal{H}$ that is not bounded (and hence not continuous).

[Hint: Using the axiom of choice (or one of its equivalent forms), construct an
algebraic basis of $\mathcal{H}$, $\{e_\alpha\}$; it has the property that every element of $\mathcal{H}$ is uniquely
a finite linear combination of the $\{e_\alpha\}$. Select a denumerable collection $\{e_n\}_{n=1}^\infty$,
and define $\ell$ to satisfy the requirement that $\ell(e_n) = n\|e_n\|$ for all $n \in \mathbb{N}$.]

2.* The following is an example of a non-separable Hilbert space. We consider
the collection of exponentials $\{e^{i\lambda x}\}$ on $\mathbb{R}$, where $\lambda$ ranges over the real numbers.
Let $\mathcal{H}_0$ denote the space of finite linear combinations of these exponentials. For
$f, g \in \mathcal{H}_0$, we define the inner product as

$$(f,g) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(x)\overline{g(x)}\,dx.$$

(a) Show that this limit exists, and

$$(f,g) = \sum_{k=1}^{N} a_{\lambda_k}\overline{b_{\lambda_k}}$$

if $f(x) = \sum_{k=1}^{N} a_{\lambda_k} e^{i\lambda_k x}$ and $g(x) = \sum_{k=1}^{N} b_{\lambda_k} e^{i\lambda_k x}$.

(b) With this inner product $\mathcal{H}_0$ is a pre-Hilbert space. Notice that $\|f\| \le
\sup_x |f(x)|$, if $f \in \mathcal{H}_0$, where $\|f\|$ denotes the norm $(f,f)^{1/2}$. Let $\mathcal{H}$ be
the completion of $\mathcal{H}_0$. Then $\mathcal{H}$ is not separable because $e^{i\lambda x}$ and $e^{i\lambda' x}$ are
orthonormal if $\lambda \ne \lambda'$.

A continuous function $F$ defined on $\mathbb{R}$ is called almost periodic if it is the
uniform limit (on $\mathbb{R}$) of elements in $\mathcal{H}_0$. Such functions can be identified
with (certain) elements in the completion $\mathcal{H}$. We have $\mathcal{H}_0 \subset AP \subset \mathcal{H}$, where
$AP$ denotes the almost periodic functions.

(c) A continuous function $F$ is in $AP$ if for every $\epsilon > 0$ we can find a length
$L = L_\epsilon$ such that any interval $I \subset \mathbb{R}$ of length $L$ contains an “almost period”
$\tau$ satisfying

$$\sup_x |F(x + \tau) - F(x)| < \epsilon.$$

(d) An equivalent characterization is that $F$ is in $AP$ if and only if every
sequence $F(x + h_n)$ of translates of $F$ contains a subsequence that converges
uniformly.


3. The following is a direct generalization of Fatou’s theorem: if $u(re^{i\theta})$ is harmonic
in the unit disc and bounded there, then $\lim_{r\to 1} u(re^{i\theta})$ exists for a.e. $\theta$.



From this one can proceed as in the proof of Theorem 3.3.] 

4.* This problem provides some examples of functions that fail to have radial limits
almost everywhere.

(a) At almost every point of the boundary unit circle, the function $\sum_{n=0}^\infty z^{2^n}$
fails to have a radial limit.

(b) More generally, suppose $F(z) = \sum_{n=0}^\infty a_n z^{2^n}$. Then, if $\sum |a_n|^2 = \infty$ the
function $F$ fails to have radial limits at almost every boundary point. However,
if $\sum |a_n|^2 < \infty$, then $F \in H^2(\mathbb{D})$, and we know by the proof of
Theorem 3.3 that $F$ does have radial limits almost everywhere.


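5.* Suppose $F$ is holomorphic in the unit disc, and

\[ \sup_{0 \le r < 1} \frac{1}{2\pi}\int_{-\pi}^{\pi} \log^+ |F(re^{i\theta})|\,d\theta < \infty, \]

where $\log^+ u = \log u$ if $u \ge 1$, and $\log^+ u = 0$ if $u < 1$.
Then $\lim_{r\to 1} F(re^{i\theta})$ exists for almost every $\theta$.

The above condition is satisfied whenever (say)

\[ \sup_{0 \le r < 1} \frac{1}{2\pi}\int_{-\pi}^{\pi} |F(re^{i\theta})|^p\,d\theta < \infty, \quad \text{for some } p > 0, \]

(since $e^{pu} \ge pu$, $u \ge 0$).

Functions that satisfy the latter condition are said to belong to the Hardy
space $H^p(\mathbb{D})$.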

6.* If $T$ is compact, and $\lambda \ne 0$, show that


®See also Section 5, Chapter 2 in Book I. 
(a) $\lambda I - T$ is injective if and only if $\overline{\lambda} I - T^*$ is injective.

(b) $\lambda I - T$ is injective if and only if $\lambda I - T$ is surjective.

This result, known as the Fredholm alternative, is often combined with that in
Exercise 29.

7. Show that the identity operator on $L^2(\mathbb{R}^d)$ cannot be given as an (absolutely)
convergent integral operator. More precisely, if $K(x,y)$ is a measurable function
on $\mathbb{R}^d \times \mathbb{R}^d$ with the property that for each $f \in L^2(\mathbb{R}^d)$, the integral $T(f)(x) =
\int_{\mathbb{R}^d} K(x,y)f(y)\,dy$ converges for almost every $x$, then $T(f) \ne f$ for some $f$.

[Hint: Prove that otherwise for any pair of disjoint balls $B_1$ and $B_2$ in $\mathbb{R}^d$, we
would have that $K(x,y) = 0$ for a.e. $(x,y) \in B_1 \times B_2$.]

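8.* Suppose $\{T_k\}$ is a collection of bounded operators on a Hilbert space $\mathcal{H}$.
Assume that

\[ \|T_k T_j^*\| \le a_{k-j}^2 \quad\text{and}\quad \|T_k^* T_j\| \le a_{k-j}^2, \]

for positive constants $\{a_n\}$ with the property that $\sum_{n=-\infty}^{\infty} a_n = A < \infty$. Then
$S_N(f)$ converges as $N \to \infty$, for every $f \in \mathcal{H}$, with $S_N = \sum_{-N}^{N} T_k$. Moreover,
$T = \lim_{N\to\infty} S_N$ satisfies $\|T\| \le A$.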


9. A discussion of a class of regular Sturm-Liouville operators follows. Other
special examples are given in the problems below.

Suppose $[a,b]$ is a bounded interval, and $L$ is defined on functions $f$ that are
twice continuously differentiable in $[a,b]$ (we write $f \in C^2([a,b])$) by

\[ L(f)(x) = \frac{d^2 f}{dx^2} - q(x)f(x). \]

Here the function $q$ is continuous and real-valued on $[a,b]$, and we assume for
simplicity that $q$ is non-negative. We say that $\varphi \in C^2([a,b])$ is an eigenfunction
of $L$ with eigenvalue $\mu$ if $L(\varphi) = \mu\varphi$, under the assumption that $\varphi$ satisfies the
boundary conditions $\varphi(a) = \varphi(b) = 0$. Then one can show:

(a) The eigenvalues $\mu$ are strictly negative, and the eigenspace corresponding
to each eigenvalue is one-dimensional.

(b) Eigenvectors corresponding to distinct eigenvalues are orthogonal in $L^2([a,b])$.

(c) Let $K(x,y)$ be the “Green’s kernel” defined as follows. Choose $\varphi_-(x)$ to be
a solution of $L(\varphi_-) = 0$, with $\varphi_-(a) = 0$ but $\varphi_-'(a) \ne 0$. Similarly, choose
$\varphi_+(x)$ to be a solution of $L(\varphi_+) = 0$ with $\varphi_+(b) = 0$, but $\varphi_+'(b) \ne 0$. Let
$w = \varphi_+'(x)\varphi_-(x) - \varphi_-'(x)\varphi_+(x)$ be the “Wronskian” of these solutions, and
note that $w$ is a non-zero constant.

Set

\[ K(x,y) = \begin{cases} \dfrac{\varphi_-(x)\varphi_+(y)}{w} & \text{if } a \le x \le y \le b, \\[2ex] \dfrac{\varphi_+(x)\varphi_-(y)}{w} & \text{if } a \le y \le x \le b. \end{cases} \]
Then the operator $T$ defined by

\[ T(f)(x) = \int_a^b K(x,y)f(y)\,dy \]

is a Hilbert-Schmidt operator, and hence compact. It is also symmetric.
Moreover, whenever $f$ is continuous on $[a,b]$, $Tf$ is of class $C^2([a,b])$ and

\[ L(Tf) = f. \]

(d) As a result, each eigenvector of $T$ (with eigenvalue $\lambda$) is an eigenvector of $L$
(with eigenvalue $\mu = 1/\lambda$). Hence Theorem 6.2 proves the completeness of
the orthonormal set arising from normalizing the eigenvectors of $L$.


10.* Let $L$ be defined on $C^2([-1,1])$ by

\[ L(f)(x) = (1-x^2)\frac{d^2 f}{dx^2} - 2x\frac{df}{dx}. \]

If $\varphi_n$ is the $n$th Legendre polynomial, given by

\[ \varphi_n(x) = \left(\frac{d}{dx}\right)^{n} (1-x^2)^n, \quad n = 0, 1, 2, \ldots, \]

then $L\varphi_n = -n(n+1)\varphi_n$.

When normalized the $\varphi_n$ form an orthonormal basis of $L^2([-1,1])$ (see also
Problem 2, Chapter 3 in Book I, where $\varphi_n$ is denoted by $L_n$).


11.* The Hermite functions $h_k(x)$ are defined by the generating identity

\[ \sum_{k=0}^{\infty} h_k(x)\frac{t^k}{k!} = e^{-(x^2/2 - 2tx + t^2)}. \]

(a) They satisfy the “creation” and “annihilation” identities $\left(x - \frac{d}{dx}\right)h_k(x) =
h_{k+1}(x)$ and $\left(x + \frac{d}{dx}\right)h_k(x) = 2k\,h_{k-1}(x)$ for $k \ge 0$, where $h_{-1}(x) = 0$. Note
that $h_0(x) = e^{-x^2/2}$, $h_1(x) = 2xe^{-x^2/2}$, and more generally $h_k(x) =
P_k(x)e^{-x^2/2}$, where $P_k$ is a polynomial of degree $k$.

(b) Using (a) one sees that the $h_k$ are eigenvectors of the operator $L = -d^2/dx^2 +
x^2$, with $L(h_k) = \lambda_k h_k$, where $\lambda_k = 2k + 1$. One observes that these
functions are mutually orthogonal. Since

\[ \int_{\mathbb{R}} [h_k(x)]^2\,dx = \pi^{1/2}2^k k! = c_k, \]

we can normalize them, obtaining an orthonormal sequence $\{H_k\}$, with $H_k =
c_k^{-1/2}h_k$. This sequence is complete in $L^2(\mathbb{R})$ since $\int_{\mathbb{R}} fH_k\,dx = 0$ for all $k$
implies $\int_{-\infty}^{\infty} f(x)e^{-\frac{x^2}{2} + 2tx}\,dx = 0$ for all $t \in \mathbb{C}$.
(c) Suppose that $K(x,y) = \sum_{k=0}^{\infty} \frac{H_k(x)H_k(y)}{\lambda_k}$, and also $F(x) = T(f)(x) =
\int_{\mathbb{R}} K(x,y)f(y)\,dy$. Then $T$ is a symmetric Hilbert-Schmidt operator, and
if $f \sim \sum_{k=0}^{\infty} a_k H_k$, then $F \sim \sum_{k=0}^{\infty} \frac{a_k}{\lambda_k}H_k$.

One can show on the basis of (a) and (b) that whenever $f \in L^2(\mathbb{R})$, not only is
$F \in L^2(\mathbb{R})$, but also $x^2F(x) \in L^2(\mathbb{R})$. Moreover, $F$ can be corrected on a set of
measure zero, so it is continuously differentiable, $F'$ is absolutely continuous, and
$F'' \in L^2(\mathbb{R})$. Finally, the operator $T$ is the inverse of $L$ in the sense that

\[ LT(f) = LF = -F'' + x^2F = f \quad\text{for every } f \in L^2(\mathbb{R}). \]

(See also Problem 7* in Chapter 5 of Book I.)


5 Hilbert Spaces: Several 
Examples 


What is the difference between a mathematician and 
a physicist? It is this: To a mathematician all Hilbert 
spaces are the same; for a physicist, however, it is their 
different realizations that really matter. 

Attributed to E. Wigner, ca. 1960 


Hilbert spaces arise in a large number of different contexts in analysis. 
Although it is a truism that all (infinite-dimensional) Hilbert spaces are 
the same, it is in fact their varied and distinct realizations and separate 
applications that make them of such interest in mathematics. We shall 
illustrate this via several examples. 

To begin with, we consider the Plancherel formula and the resulting 
unitary character of the Fourier transform. The relevance of these ideas 
to complex analysis is then highlighted by the study of holomorphic functions
in a half-space that belong to the Hardy space $H^2$. That function
space itself is another interesting realization of a Hilbert space. The
considerations here are analogous to the ideas that led us to Fatou’s theorem
for the unit disc, but are of a more involved character. 

We next see how complex analysis and the Fourier transform com- 
bine to guarantee the existence of solutions to linear partial differential 
equations with constant coefficients. The proof relies on a basic es- 
timate, which once established can be exploited by simple Hilbert space 
techniques. 

Our final example is Dirichlet’s principle and its applications to the 
boundary value problem for harmonic functions. Here the Hilbert space 
that arises is given by Dirichlet’s integral, and the solution is expressed 
by aid of an appropriate orthogonal projection operator. 

1 The Fourier transform on $L^2$

The Fourier transform of a function $f$ on $\mathbb{R}^d$ is defined by

\[ \hat{f}(\xi) = \int_{\mathbb{R}^d} f(x)e^{-2\pi i x\cdot\xi}\,dx, \tag{1} \]
and its attached inversion is given by

\[ f(x) = \int_{\mathbb{R}^d} \hat{f}(\xi)e^{2\pi i x\cdot\xi}\,d\xi. \tag{2} \]

These formulas have already appeared in several different contexts.
We considered first (in Book I) the properties of the Fourier transform
in the elementary setting by restricting to functions in the Schwartz class
$\mathcal{S}(\mathbb{R}^d)$. The class $\mathcal{S}$ consists of functions $f$ that are smooth (indefinitely
differentiable) and such that for each multi-index $\alpha$ and $\beta$, the function
$x^\alpha \left(\frac{\partial}{\partial x}\right)^\beta f$ is bounded on $\mathbb{R}^d$.¹ We saw that on this class the Fourier
transform is a bijection, that the inversion formula (2) holds, and moreover
we have the Plancherel identity

\[ \int_{\mathbb{R}^d} |\hat{f}(\xi)|^2\,d\xi = \int_{\mathbb{R}^d} |f(x)|^2\,dx. \tag{3} \]

Turning now to more general (in particular, non-continuous) functions,
we note that the largest class for which the integral defining $\hat{f}(\xi)$
converges (absolutely) is the space $L^1(\mathbb{R}^d)$. For it, we saw in Chapter 2 that
a (relative) inversion formula is valid.

Beyond these particular facts, what we would like here is to reestablish
in the general context the symmetry between $f$ and $\hat{f}$ that holds for $\mathcal{S}$.
This is where the special role of the Hilbert space $L^2(\mathbb{R}^d)$ enters.

We shall define the Fourier transform on $L^2(\mathbb{R}^d)$ as an extension of its
definition on $\mathcal{S}$. For this purpose, we temporarily adopt the notational
device of denoting by $\mathcal{F}_0$ and $\mathcal{F}$ the Fourier transform on $\mathcal{S}$ and its
extension to $L^2$, respectively.

The main results we prove are the following. 

Theorem 1.1 The Fourier transform $\mathcal{F}_0$, initially defined on $\mathcal{S}(\mathbb{R}^d)$,
has a (unique) extension $\mathcal{F}$ to a unitary mapping of $L^2(\mathbb{R}^d)$ to itself. In
particular,

\[ \|\mathcal{F}(f)\|_{L^2(\mathbb{R}^d)} = \|f\|_{L^2(\mathbb{R}^d)} \]

for all $f \in L^2(\mathbb{R}^d)$.

The extension $\mathcal{F}$ will be given by a limiting process: if $\{f_n\}$ is a sequence
in the Schwartz space that converges to $f$ in $L^2(\mathbb{R}^d)$, then $\{\mathcal{F}_0(f_n)\}$ will

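¹Recall that $x^\alpha = x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_d^{\alpha_d}$ and $\left(\frac{\partial}{\partial x}\right)^\beta = \left(\frac{\partial}{\partial x_1}\right)^{\beta_1}\cdots\left(\frac{\partial}{\partial x_d}\right)^{\beta_d}$, where $\alpha =
(\alpha_1, \ldots, \alpha_d)$ and $\beta = (\beta_1, \ldots, \beta_d)$, with $\alpha_j$ and $\beta_j$ positive integers. The order of $\alpha$ is
denoted by $|\alpha|$ and defined to be $\alpha_1 + \cdots + \alpha_d$.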
converge to an element in $L^2(\mathbb{R}^d)$, which we will define as the Fourier
transform of $f$. To implement this approach we have to see that every
$L^2$ function can be approximated by elements in the Schwartz space.

Lemma 1.2 The space $\mathcal{S}(\mathbb{R}^d)$ is dense in $L^2(\mathbb{R}^d)$. In other words, given
any $f \in L^2(\mathbb{R}^d)$, there exists a sequence $\{f_n\} \subset \mathcal{S}(\mathbb{R}^d)$ such that

\[ \|f - f_n\|_{L^2(\mathbb{R}^d)} \to 0 \quad\text{as } n \to \infty. \]

For the proof of the lemma, we fix $f \in L^2(\mathbb{R}^d)$ and $\epsilon > 0$. Then, for
each $M > 0$, we define

\[ g_M(x) = \begin{cases} f(x) & \text{if } |x| \le M \text{ and } |f(x)| \le M, \\ 0 & \text{otherwise.} \end{cases} \]

Then, $|f(x) - g_M(x)| \le 2|f(x)|$, hence $|f(x) - g_M(x)|^2 \le 4|f(x)|^2$, and
since $g_M(x) \to f(x)$ as $M \to \infty$ for almost every $x$, the dominated
convergence theorem guarantees that for some $M$, we have

\[ \|f - g_M\|_{L^2(\mathbb{R}^d)} < \epsilon. \]

We write $g = g_M$, note that this function is bounded and supported on
a bounded set, and observe that it now suffices to approximate $g$ by
functions in the Schwartz space. To achieve this goal, we use a method
called regularization, which consists of “smoothing” $g$ by convolving it
with an approximation of the identity. Consider a function $\varphi(x)$ on $\mathbb{R}^d$
with the following properties:

(a) $\varphi$ is smooth (indefinitely differentiable).

(b) $\varphi$ is supported in the unit ball.

(c) $\varphi \ge 0$.

(d) $\int_{\mathbb{R}^d} \varphi(x)\,dx = 1$.

For instance, one can take

\[ \varphi(x) = \begin{cases} c\,e^{-\frac{1}{1-|x|^2}} & \text{if } |x| < 1, \\ 0 & \text{if } |x| \ge 1, \end{cases} \]

where the constant $c$ is chosen so that (d) holds.

Next, we consider the approximation to the identity defined by

\[ K_\delta(x) = \delta^{-d}\varphi(x/\delta). \]
The key observation is that $g * K_\delta$ belongs to $\mathcal{S}(\mathbb{R}^d)$, with this convolution
in fact bounded and supported on a fixed bounded set, uniformly in
$\delta$ (assuming for example that $\delta \le 1$). Indeed, we may write

\[ (g * K_\delta)(x) = \int g(y)K_\delta(x-y)\,dy = \int g(x-y)K_\delta(y)\,dy, \]

in view of the identity (6) in Chapter 2. We note that since $g$ is supported
on some bounded set and $K_\delta$ vanishes outside the ball of radius $\delta$, the
function $g * K_\delta$ is supported in some fixed bounded set independent of $\delta$.
Also, the function $g$ is bounded by construction, hence

\[ |(g * K_\delta)(x)| \le \int |g(x-y)|K_\delta(y)\,dy \le \sup_{z\in\mathbb{R}^d}|g(z)| \int K_\delta(y)\,dy = \sup_{z\in\mathbb{R}^d}|g(z)|, \]

which shows that $g * K_\delta$ is also uniformly bounded in $\delta$. Moreover, from
the first integral expression for $g * K_\delta$ above, one may differentiate under
the integral sign to see that $g * K_\delta$ is smooth and all of its derivatives
have support in some fixed bounded set.

The proof of the lemma will be complete if we can show that $g * K_\delta$
converges to $g$ in $L^2(\mathbb{R}^d)$. Now Theorem 2.1 in Chapter 3 guarantees
that for almost every $x$, the quantity $|(g * K_\delta)(x) - g(x)|^2$ converges to 0
as $\delta$ tends to 0. An application of the bounded convergence theorem
(Theorem 1.4 in Chapter 2) yields

\[ \|(g * K_\delta) - g\|_{L^2(\mathbb{R}^d)}^2 \to 0 \quad\text{as } \delta \to 0. \]

In particular, $\|(g * K_\delta) - g\|_{L^2(\mathbb{R}^d)} < \epsilon$ for an appropriate $\delta$ and hence
$\|f - g * K_\delta\|_{L^2(\mathbb{R}^d)} < 2\epsilon$, and choosing a sequence of $\epsilon$ tending to zero
gives the construction of the desired sequence $\{f_n\}$.
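
The regularization step can be watched numerically. The following is only a sketch in dimension $d = 1$, under assumptions of our own: the test function g is the characteristic function of $[-1,1]$, integrals are replaced by Riemann sums on a grid of step h, and the helper names (bump, l2_error) are illustrative. It estimates $\|g * K_\delta - g\|_{L^2}$ for a few values of $\delta$.

```python
import math

def bump(x):
    """Unnormalized bump exp(-1/(1 - x^2)) on |x| < 1, zero elsewhere."""
    t = 1.0 - x * x
    return math.exp(-1.0 / t) if t > 0.0 else 0.0

def g(x):
    """Bounded, compactly supported, discontinuous test function (our choice)."""
    return 1.0 if abs(x) <= 1.0 else 0.0

def l2_error(delta, h=0.005, span=2.0):
    """Riemann-sum estimate of the L^2(R) norm of g * K_delta - g."""
    m = int(round(delta / h))                  # kernel half-width in grid steps
    offsets = list(range(-m, m + 1))
    weights = [bump(j * h / delta) for j in offsets]
    total = sum(weights)
    weights = [w / total for w in weights]     # discrete analogue of property (d)
    n = int(round(span / h))
    err2 = 0.0
    for i in range(-n, n + 1):
        x = i * h
        conv = sum(w * g(x - j * h) for w, j in zip(weights, offsets))
        err2 += (conv - g(x)) ** 2 * h
    return math.sqrt(err2)

errors = {delta: l2_error(delta) for delta in (0.4, 0.2, 0.1, 0.05)}
for delta, err in errors.items():
    print(f"delta = {delta:4.2f}   estimated L2 error = {err:.4f}")
```

For this particular g the smoothed function differs from g only within distance $\delta$ of the points $\pm 1$, so the estimated error should shrink as $\delta$ decreases, in line with the convergence asserted in the proof.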

For later purposes it is useful to observe that the proof of the above
lemma establishes the following assertion: if $f$ belongs to both $L^1(\mathbb{R}^d)$
and $L^2(\mathbb{R}^d)$, then there is a sequence $\{f_n\}$, $f_n \in \mathcal{S}(\mathbb{R}^d)$, that converges
to $f$ in both the $L^1$-norm and the $L^2$-norm.

Our definition of the Fourier transform on $L^2(\mathbb{R}^d)$ combines the above
density of $\mathcal{S}$ with a general “extension principle.”

Lemma 1.3 Let $\mathcal{H}_1$ and $\mathcal{H}_2$ denote Hilbert spaces with norms $\|\cdot\|_1$ and
$\|\cdot\|_2$, respectively. Suppose $S$ is a dense subspace of $\mathcal{H}_1$ and $T_0 : S \to \mathcal{H}_2$
a linear transformation that satisfies $\|T_0(f)\|_2 \le c\|f\|_1$ whenever $f \in S$.
Then $T_0$ extends to a (unique) linear transformation $T : \mathcal{H}_1 \to \mathcal{H}_2$ that
satisfies $\|T(f)\|_2 \le c\|f\|_1$ for all $f \in \mathcal{H}_1$.

Proof. Given $f \in \mathcal{H}_1$, let $\{f_n\}$ be a sequence in $S$ that converges to $f$,
and define

\[ T(f) = \lim_{n\to\infty} T_0(f_n), \]

where the limit is taken in $\mathcal{H}_2$. To see that $T$ is well-defined we must
verify that the limit exists, and that it is independent of the sequence
$\{f_n\}$ used to approximate $f$. Indeed, for the first point, we note that
$\{T_0(f_n)\}$ is a Cauchy sequence in $\mathcal{H}_2$ because by construction $\{f_n\}$ is
Cauchy in $\mathcal{H}_1$, and the inequality verified by $T_0$ yields

\[ \|T_0(f_n) - T_0(f_m)\|_2 \le c\|f_n - f_m\|_1 \to 0 \quad\text{as } n, m \to \infty; \]

thus $\{T_0(f_n)\}$ is Cauchy, hence converges in $\mathcal{H}_2$.

Second, to justify that the limit is independent of the approximating
sequence, let $\{g_n\}$ be another sequence in $S$ that converges to $f$ in $\mathcal{H}_1$.
Then

\[ \|T_0(f_n) - T_0(g_n)\|_2 \le c\|f_n - g_n\|_1, \]

and since $\|f_n - g_n\|_1 \le \|f_n - f\|_1 + \|f - g_n\|_1$, we conclude that $\{T_0(g_n)\}$
converges to a limit in $\mathcal{H}_2$ that equals the limit of $\{T_0(f_n)\}$.

Finally, we recall that if $f_n \to f$ and $T_0(f_n) \to T(f)$, then $\|f_n\|_1 \to
\|f\|_1$ and $\|T_0(f_n)\|_2 \to \|T(f)\|_2$, so in the limit as $n \to \infty$, the inequality
$\|T(f)\|_2 \le c\|f\|_1$ holds for all $f \in \mathcal{H}_1$.

In the present case of the Fourier transform, we apply this lemma with
$\mathcal{H}_1 = \mathcal{H}_2 = L^2(\mathbb{R}^d)$ (equipped with the $L^2$-norm), $S = \mathcal{S}(\mathbb{R}^d)$, and $T_0 =
\mathcal{F}_0$ the Fourier transform defined on the Schwartz space. The Fourier
transform on $L^2(\mathbb{R}^d)$ is by definition the unique (bounded) extension of
$\mathcal{F}_0$ to $L^2$ guaranteed by Lemma 1.3. Thus if $f \in L^2(\mathbb{R}^d)$ and $\{f_n\}$ is any
sequence in $\mathcal{S}(\mathbb{R}^d)$ that converges to $f$ (that is, $\|f - f_n\|_{L^2(\mathbb{R}^d)} \to 0$ as
$n \to \infty$), we define the Fourier transform of $f$ by

\[ \mathcal{F}(f) = \lim_{n\to\infty} \mathcal{F}_0(f_n), \tag{4} \]

where the limit is taken in the $L^2$ sense. Clearly, the argument in the
proof of the lemma shows that in our special case the extension $\mathcal{F}$
continues to satisfy the identity (3):

\[ \|\mathcal{F}(f)\|_{L^2(\mathbb{R}^d)} = \|f\|_{L^2(\mathbb{R}^d)} \quad\text{whenever } f \in L^2(\mathbb{R}^d). \]
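The fact that $\mathcal{F}$ is invertible on $L^2$ (and thus $\mathcal{F}$ is a unitary mapping)
is also a consequence of the analogous property on $\mathcal{S}(\mathbb{R}^d)$. Recall that
on the Schwartz space, $\mathcal{F}_0^{-1}$ is given by formula (2), that is,

\[ \mathcal{F}_0^{-1}(g)(x) = \int_{\mathbb{R}^d} g(\xi)e^{2\pi i x\cdot\xi}\,d\xi, \]

and satisfies again the identity $\|\mathcal{F}_0^{-1}(g)\|_{L^2} = \|g\|_{L^2}$. Therefore, arguing
in the same fashion as above, we can extend $\mathcal{F}_0^{-1}$ to $L^2(\mathbb{R}^d)$ by a limiting
argument. Then, given $f \in L^2(\mathbb{R}^d)$, we choose a sequence $\{f_n\}$ in the
Schwartz space so that $\|f - f_n\|_{L^2} \to 0$. We have

\[ f_n = \mathcal{F}_0^{-1}\mathcal{F}_0(f_n) = \mathcal{F}_0\mathcal{F}_0^{-1}(f_n), \]

and taking the limit as $n$ tends to infinity, we see that

\[ f = \mathcal{F}^{-1}\mathcal{F}(f) = \mathcal{F}\mathcal{F}^{-1}(f), \]

and hence $\mathcal{F}$ is invertible. This concludes the proof of Theorem 1.1.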

Some remarks are in order. 

(i) Suppose $f$ belongs to both $L^1(\mathbb{R}^d)$ and $L^2(\mathbb{R}^d)$. Are the two definitions
of the Fourier transform the same? That is, do we have $\mathcal{F}(f) = \hat{f}$, with
$\mathcal{F}(f)$ defined by the limiting process in Theorem 1.1 and $\hat{f}$ defined by the
convergent integral (1)? To prove that this is indeed the case we recall
that we can approximate $f$ by a sequence $\{f_n\}$ in $\mathcal{S}$ so that $f_n \to f$ both
in the $L^1$-norm and the $L^2$-norm. Since $\mathcal{F}_0(f_n) = \hat{f}_n$, a passage to the
limit gives the desired conclusion. In fact, $\mathcal{F}_0(f_n)$ converges to $\mathcal{F}(f)$ in
the $L^2$-norm, so a subsequence converges to $\mathcal{F}(f)$ almost everywhere; see
the analogous statement for $L^1$ in Corollary 2.3, Chapter 2. Moreover,

\[ \sup_{\xi\in\mathbb{R}^d} |\hat{f}_n(\xi) - \hat{f}(\xi)| \le \|f_n - f\|_{L^1(\mathbb{R}^d)}, \]

hence $\hat{f}_n$ converges to $\hat{f}$ everywhere, and the assertion is established.

(ii) The theorem gives a rather abstract definition of the Fourier transform
on $L^2$. In view of what we have just said, we can also define the
Fourier transform more concretely as follows. If $f \in L^2(\mathbb{R}^d)$, then

\[ \hat{f}(\xi) = \lim_{R\to\infty} \int_{|x|\le R} f(x)e^{-2\pi i x\cdot\xi}\,dx, \]

where the limit is taken in the $L^2$-norm. Note in fact that if $\chi_R$ denotes
the characteristic function of the ball $\{x \in \mathbb{R}^d : |x| \le R\}$, then for each
$R$ the function $f\chi_R$ is in both $L^1$ and $L^2$, and $f\chi_R \to f$ in the $L^2$-norm.
(iii) The identity of the various definitions of the Fourier transform
discussed above allows us to choose $\hat{f}$ as the preferred notation for the
Fourier transform. We adopt this practice in what follows.
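
Remark (ii) can be illustrated numerically. The sketch below is not part of the argument and rests on choices of our own: $d = 1$, the test function $f(x) = e^{-\pi x^2}$ (whose Fourier transform is $e^{-\pi\xi^2}$), midpoint quadrature, and a frequency window $|\xi| \le 4$. By the unitarity in Theorem 1.1, the $L^2$ distance between the transform of $f\chi_R$ and $\hat{f}$ equals the $L^2$ norm of $f$ on $|x| > R$; the computed distance should be close to that tail norm, and slightly smaller because of the frequency window.

```python
import cmath
import math

def f(x):
    """Test function f(x) = exp(-pi x^2), whose Fourier transform is exp(-pi xi^2)."""
    return math.exp(-math.pi * x * x)

def truncated_transform(xi, R, h=0.01):
    """Midpoint-rule value of the integral of f(x) e^{-2 pi i x xi} over |x| <= R."""
    n = int(round(2.0 * R / h))
    total = 0j
    for k in range(n):
        x = -R + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h

def l2_distance(R, xi_max=4.0, dxi=0.05):
    """Riemann-sum estimate, over |xi| <= xi_max, of the L^2 distance to exp(-pi xi^2)."""
    m = int(round(xi_max / dxi))
    acc = 0.0
    for j in range(-m, m + 1):
        xi = j * dxi
        diff = truncated_transform(xi, R) - math.exp(-math.pi * xi * xi)
        acc += abs(diff) ** 2 * dxi
    return math.sqrt(acc)

def tail_norm(R):
    """Exact L^2 norm of f restricted to |x| > R (a Gaussian tail integral)."""
    return math.sqrt(math.erfc(math.sqrt(2.0 * math.pi) * R) / math.sqrt(2.0))

distances = {R: l2_distance(R) for R in (0.5, 1.0, 2.0)}
for R, d in distances.items():
    print(f"R = {R:3.1f}   distance = {d:.2e}   tail norm of f = {tail_norm(R):.2e}")
```

The distances decrease rapidly with $R$, which is the $L^2$ convergence described in (ii) for this particular $f$.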


2 The Hardy space of the upper half-plane 

We will apply the theory of the Fourier transform to holomorphic 
functions in the upper half-plane. This leads us to consider the relevant 
analogues of the Hardy space and Fatou’s theorem discussed in the previous
chapter.² It incidentally provides an answer to the following natural
question: What are the functions $f \in L^2(\mathbb{R})$ whose Fourier transforms
are supported on the half-line $(0,\infty)$?

Let $\mathbb{R}^2_+ = \{z = x + iy,\ x \in \mathbb{R},\ y > 0\}$ be the upper half-plane. We
define the Hardy space $H^2(\mathbb{R}^2_+)$ to consist of all functions $F$ analytic
in $\mathbb{R}^2_+$ with the property that

\[ \sup_{y>0} \int_{\mathbb{R}} |F(x+iy)|^2\,dx < \infty. \tag{5} \]

We define the corresponding norm, $\|F\|_{H^2(\mathbb{R}^2_+)}$, to be the square root of
the quantity (5).

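Let us first describe a (typical) example of a function $F$ in $H^2(\mathbb{R}^2_+)$.
We start with a function $\hat{F}_0$ that belongs to $L^2(0,\infty)$, and write

\[ F(x+iy) = \int_0^\infty \hat{F}_0(\xi)e^{2\pi i\xi z}\,d\xi, \quad z = x + iy,\ y > 0. \tag{6} \]

(The choice of the particular notation $\hat{F}_0$ will become clearer below.)
We claim that for any $\delta > 0$ the integral (6) converges absolutely and
uniformly as long as $y \ge \delta$. Indeed, $|\hat{F}_0(\xi)e^{2\pi i\xi z}| = |\hat{F}_0(\xi)|e^{-2\pi\xi y}$, hence
by the Cauchy-Schwarz inequality

\[ \int_0^\infty |\hat{F}_0(\xi)e^{2\pi i\xi z}|\,d\xi \le \left(\int_0^\infty |\hat{F}_0(\xi)|^2\,d\xi\right)^{1/2} \left(\int_0^\infty e^{-4\pi\xi\delta}\,d\xi\right)^{1/2}, \]

from which the asserted convergence is established. From the uniform
convergence it follows that $F(z)$ is holomorphic in the upper half-plane.
Moreover, by Plancherel’s theorem

\[ \int_{\mathbb{R}} |F(x+iy)|^2\,dx = \int_0^\infty |\hat{F}_0(\xi)|^2e^{-4\pi\xi y}\,d\xi \le \|\hat{F}_0\|^2_{L^2(0,\infty)}, \]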

Jr Jo 


²Further motivation and some elementary background material may be found in
Theorem 3.5 in Chapter 4 of Book II.
and in fact, by the monotone convergence theorem,

\[ \sup_{y>0} \int_{\mathbb{R}} |F(x+iy)|^2\,dx = \|\hat{F}_0\|^2_{L^2(0,\infty)}. \]

In particular, $F$ belongs to $H^2(\mathbb{R}^2_+)$. The main result we prove next is
the converse, that is, every element of the space $H^2(\mathbb{R}^2_+)$ is in fact of the
form (6).

Theorem 2.1 The elements $F$ in $H^2(\mathbb{R}^2_+)$ are exactly the functions
given by (6), with $\hat{F}_0 \in L^2(0,\infty)$. Moreover

\[ \|F\|_{H^2(\mathbb{R}^2_+)} = \|\hat{F}_0\|_{L^2(0,\infty)}. \]

This shows incidentally that $H^2(\mathbb{R}^2_+)$ is a Hilbert space that is isomorphic
to $L^2(0,\infty)$ via the correspondence (6).
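
A concrete instance may help fix ideas. Take, as an example of our own choosing, $\hat{F}_0(\xi) = e^{-2\pi\xi}$ on $(0,\infty)$. Then (6) can be evaluated in closed form, $F(z) = 1/(2\pi(1 - iz))$, and a direct computation gives $\int_{\mathbb{R}} |F(x+iy)|^2\,dx = 1/(4\pi(1+y))$, whose supremum over $y > 0$ is $1/(4\pi) = \|\hat{F}_0\|^2_{L^2(0,\infty)}$. The sketch below checks both statements by midpoint quadrature; the truncation limits and step sizes are arbitrary choices, and the function names are illustrative.

```python
import cmath
import math

def F0_hat(xi):
    """Sample choice of a function in L^2(0, infinity): exp(-2 pi xi)."""
    return math.exp(-2.0 * math.pi * xi)

def F_closed(z):
    """Closed form of (6) for this choice: F(z) = 1 / (2 pi (1 - i z))."""
    return 1.0 / (2.0 * math.pi * (1.0 - 1j * z))

def F_quadrature(z, xi_max=10.0, h=0.001):
    """Midpoint-rule value of the integral (6), truncated to 0 < xi < xi_max."""
    n = int(round(xi_max / h))
    total = 0j
    for k in range(n):
        xi = (k + 0.5) * h
        total += F0_hat(xi) * cmath.exp(2j * math.pi * xi * z)
    return total * h

def slice_norm_sq(y, x_max=2000.0, h=0.1):
    """Midpoint-rule value of the integral of |F(x + iy)|^2 over |x| <= x_max."""
    n = int(round(2.0 * x_max / h))
    total = 0.0
    for k in range(n):
        x = -x_max + (k + 0.5) * h
        total += abs(F_closed(complex(x, y))) ** 2
    return total * h

z = complex(0.3, 0.7)
print("quadrature of (6):", F_quadrature(z), "  closed form:", F_closed(z))
norm_sq = 1.0 / (4.0 * math.pi)   # squared L^2(0, infinity) norm of F0_hat
for y in (0.1, 1.0, 3.0):
    print(f"y = {y}: slice integral = {slice_norm_sq(y):.5f}, "
          f"predicted = {norm_sq / (1.0 + y):.5f}, bound = {norm_sq:.5f}")
```

The slice integrals increase toward the bound $1/(4\pi)$ as $y$ decreases, as the monotone convergence argument predicts; the truncation to $|x| \le 2000$ leaves a small deficit in each computed value.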

The crucial point in the proof of the theorem is the following fact. For
any fixed strictly positive $y$, we let $\hat{F}_y(\xi)$ denote the Fourier transform
of the $L^2$ function $F(x+iy)$, $x \in \mathbb{R}$. Then for any pair of choices of $y$,
$y_1$ and $y_2$, we have that

\[ \hat{F}_{y_1}(\xi)e^{2\pi y_1\xi} = \hat{F}_{y_2}(\xi)e^{2\pi y_2\xi} \quad\text{for a.e. } \xi. \tag{7} \]

To establish this assertion we rely on a useful technical observation.

Lemma 2.2 If $F$ belongs to $H^2(\mathbb{R}^2_+)$, then $F$ is bounded in any proper
half-plane $\{z = x + iy,\ y \ge \delta\}$, where $\delta > 0$.

To prove this we exploit the mean-value property of holomorphic functions.
This property may be stated in two alternative ways. First, in
terms of averages over circles,

\[ F(\zeta) = \frac{1}{2\pi}\int_0^{2\pi} F(\zeta + re^{i\theta})\,d\theta \quad\text{if } 0 < r \le \delta. \tag{8} \]

(Note that if $\zeta$ lies in the upper half-plane, $\mathrm{Im}(\zeta) > \delta$, then the disc
centered at $\zeta$ of radius $r$ belongs to $\mathbb{R}^2_+$.) Alternatively, integrating over
$r$, we have the mean-value property in terms of discs,

\[ F(\zeta) = \frac{1}{\pi\delta^2}\int_{|z|<\delta} F(\zeta + z)\,dx\,dy, \quad z = x + iy. \tag{9} \]

These assertions actually hold for harmonic functions in $\mathbb{R}^2$ (see
Corollary 7.2, Chapter 3 in Book II for the result about holomorphic functions,
and Lemma 2.8, Chapter 5 in Book I for the case of harmonic functions);
later in this chapter we in fact prove the extension of (9) to $\mathbb{R}^d$.
From (9) we see from the Cauchy-Schwarz inequality that

\[ |F(\zeta)|^2 \le \frac{1}{\pi\delta^2}\int_{|z|<\delta} |F(\zeta + z)|^2\,dx\,dy. \]

Writing $z = x + iy$ and $\zeta = \xi + i\eta$, with $\eta > \delta$, we see that the disc
$B_\delta(\zeta)$ of center $\zeta$ and radius $\delta$ is contained in the strip $\{z + \zeta,\ z =
x + iy,\ -\delta < y < \delta\}$, and moreover this strip lies in the half-plane $\mathbb{R}^2_+$.
See Figure 1.



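This gives the following majorization:

\[ \int_{|z|<\delta} |F(\zeta + z)|^2\,dx\,dy \le \int_{|y|<\delta}\int_{\mathbb{R}} |F(\zeta + x + iy)|^2\,dx\,dy \le 2\delta \sup_{-\delta<y<\delta} \int_{\mathbb{R}} |F(x + i(\eta + y))|^2\,dx. \]

Recalling that $\eta > \delta$, we see that the last expression is in fact majorized
by

\[ 2\delta \sup_{y>0} \int_{\mathbb{R}} |F(x+iy)|^2\,dx = 2\delta\|F\|^2_{H^2(\mathbb{R}^2_+)}. \]

In all, $|F(\zeta)|^2 \le \frac{2}{\pi\delta}\|F\|^2_{H^2}$ in the half-plane $\mathrm{Im}(\zeta) > \delta$, which proves the
lemma.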

We now turn to the proof of the identity (7). Starting with $F$ in $H^2(\mathbb{R}^2_+)$,
we improve it by replacing it with the function $F^\epsilon$ defined by

\[ F^\epsilon(z) = F(z)\frac{1}{(1 - i\epsilon z)^2}, \quad\text{with } \epsilon > 0. \]

Observe that $|F^\epsilon(z)| \le |F(z)|$ when $\mathrm{Im}(z) > 0$; also $F^\epsilon(z) \to F(z)$ for
each such $z$, as $\epsilon \to 0$. This shows that for each $y > 0$, $F^\epsilon(x+iy) \to
F(x+iy)$ in the $L^2$-norm. Moreover, the lemma guarantees that each $F^\epsilon$
satisfies the decay estimate

\[ F^\epsilon(z) = O\left(\frac{1}{1+x^2}\right) \quad\text{whenever } \mathrm{Im}(z) > \delta, \text{ for some } \delta > 0. \]

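We assert first that (7) holds with $F$ replaced by $F^\epsilon$. This is a simple
consequence of contour integration applied to the function

\[ G(z) = F^\epsilon(z)e^{-2\pi i z\xi}. \]

In fact we integrate $G(z)$ over the rectangle with vertices $-R + iy_1$, $R +
iy_1$, $R + iy_2$, $-R + iy_2$, and let $R \to \infty$. If we take into account that
$G(z) = O(1/(1+x^2))$ in this rectangle, then we find that

\[ \int_{L_1} G(z)\,dz = \int_{L_2} G(z)\,dz, \]

where $L_j$ is the line $\{x + iy_j : x \in \mathbb{R}\}$, $j = 1, 2$. Since

\[ \int_{L_j} G(z)\,dz = \int_{\mathbb{R}} F^\epsilon(x + iy_j)e^{-2\pi i(x+iy_j)\xi}\,dx, \]

this means that

\[ \hat{F}^\epsilon_{y_1}(\xi)e^{2\pi y_1\xi} = \hat{F}^\epsilon_{y_2}(\xi)e^{2\pi y_2\xi}. \]

Since $F^\epsilon(x+iy_j) \to F(x+iy_j)$ in the $L^2$-norm as $\epsilon \to 0$, we then
obtain (7).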

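The identity we have just proved states that $\hat{F}_y(\xi)e^{2\pi\xi y}$ is independent
of $y$, $y > 0$, and thus there is a function $\hat{F}_0(\xi)$ so that $\hat{F}_y(\xi)e^{2\pi\xi y} = \hat{F}_0(\xi)$;
as a result

\[ \hat{F}_y(\xi) = \hat{F}_0(\xi)e^{-2\pi\xi y} \quad\text{for all } y > 0. \]

Therefore by Plancherel’s identity

\[ \int_{\mathbb{R}} |F(x+iy)|^2\,dx = \int_{\mathbb{R}} |\hat{F}_0(\xi)|^2e^{-4\pi\xi y}\,d\xi, \]

and hence

\[ \sup_{y>0} \int_{\mathbb{R}} |\hat{F}_0(\xi)|^2e^{-4\pi\xi y}\,d\xi = \|F\|^2_{H^2(\mathbb{R}^2_+)} < \infty. \]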
Finally this in turn implies that $\hat{F}_0(\xi) = 0$ for almost every $\xi \in (-\infty, 0)$.
For if this were not the case, then for appropriate positive numbers $a$, $b$,
and $c$ we could have that $|\hat{F}_0(\xi)| \ge a$ for $\xi$ in a set $E$ in $(-\infty, -b)$, with
$m(E) \ge c$. This would give $\int |\hat{F}_0(\xi)|^2e^{-4\pi\xi y}\,d\xi \ge a^2ce^{4\pi by}$, which grows
indefinitely as $y \to \infty$. The contradiction thus obtained shows that $\hat{F}_0(\xi)$
vanishes almost everywhere when $\xi \in (-\infty, 0)$.

To summarize, for each $y > 0$ the function $\hat{F}_y(\xi)$ equals $\hat{F}_0(\xi)e^{-2\pi\xi y}$,
with $\hat{F}_0 \in L^2(0,\infty)$. The Fourier inversion formula then yields the
representation (6) for an arbitrary element of $H^2$, and the proof of the theorem
is concluded.

The second result we deal with may be viewed as the half-plane ana- 
logue of Fatou’s theorem in the previous chapter. 

Theorem 2.3 Suppose $F$ belongs to $H^2(\mathbb{R}^2_+)$. Then $\lim_{y\to 0} F(x+iy) =
F_0(x)$ exists in the following two senses:

(i) As a limit in the $L^2(\mathbb{R})$-norm.

(ii) As a limit for almost every $x$.

Thus $F$ has boundary values (denoted by $F_0$) in either of the two senses
above. The function $F_0$ is sometimes referred to as the boundary-value
function of $F$. The proof of (i) is immediate from what we already know.
Indeed, if $F_0$ is the $L^2$ function whose Fourier transform is $\hat{F}_0$, then

\[ \|F(x+iy) - F_0(x)\|^2_{L^2(\mathbb{R})} = \int_0^\infty |\hat{F}_0(\xi)|^2\,|e^{-2\pi\xi y} - 1|^2\,d\xi, \]

and this tends to zero as $y \to 0$ by the dominated convergence theorem.

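To prove the almost everywhere convergence, we establish the Poisson
integral representation

\[ \int_{\mathbb{R}} \hat{f}(\xi)e^{-2\pi|\xi|y}e^{2\pi i x\xi}\,d\xi = \int_{\mathbb{R}} f(x-t)\mathcal{P}_y(t)\,dt, \tag{10} \]

with

\[ \mathcal{P}_y(x) = \frac{1}{\pi}\frac{y}{y^2 + x^2} \]

the Poisson kernel.³ This identity holds for every $(x,y) \in \mathbb{R}^2_+$ and any
function $f$ in $L^2(\mathbb{R})$. To see this, we begin by noting the following
elementary integration formulas:

\[ \int_0^\infty e^{2\pi i\xi z}\,d\xi = \frac{i}{2\pi z} \quad\text{if } \mathrm{Im}(z) > 0, \tag{11} \]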

³This is the analogue in $\mathbb{R}$ of the identity (3) for the circle, given in Chapter 4.
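and

\[ \int_{\mathbb{R}} e^{-2\pi|\xi|y}e^{2\pi i\xi x}\,d\xi = \frac{1}{\pi}\frac{y}{y^2 + x^2} \quad\text{if } y > 0. \tag{12} \]

The first is an immediate consequence of the fact that

\[ \int_0^N e^{2\pi i\xi z}\,d\xi = \frac{1}{2\pi i z}\left[e^{2\pi i N z} - 1\right] \]

if we let $N \to \infty$. To prove the second formula, we write the integral as

\[ \int_0^\infty e^{2\pi i\xi(x+iy)}\,d\xi + \int_0^\infty e^{2\pi i\xi(-x+iy)}\,d\xi, \]

which equals

\[ \frac{i}{2\pi}\left[\frac{1}{x+iy} + \frac{1}{-x+iy}\right] = \frac{1}{\pi}\frac{y}{y^2 + x^2}, \]

by (11).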
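
Formula (12) is easy to test numerically. The following sketch compares a midpoint-rule approximation of the left-hand side with the Poisson kernel at a few sample points; the truncation of the integral to $|\xi| \le 5/y$, the step size, and the sample points are our own choices, and the function names are illustrative.

```python
import cmath
import math

def poisson_kernel(x, y):
    """Right-hand side of (12): (1/pi) * y / (y^2 + x^2)."""
    return y / (math.pi * (y * y + x * x))

def fourier_side(x, y, h=0.001):
    """Midpoint-rule value of the left-hand side of (12), truncated to |xi| <= 5/y."""
    n = int(round(5.0 / (y * h)))     # beyond 5/y the integrand is below e^{-10 pi}
    total = 0j
    for k in range(-n, n):
        xi = (k + 0.5) * h
        total += cmath.exp(-2.0 * math.pi * abs(xi) * y + 2j * math.pi * xi * x)
    return total * h

samples = [(0.0, 0.5), (1.0, 0.5), (-2.0, 1.0), (0.3, 0.25)]
for x, y in samples:
    lhs = fourier_side(x, y)
    print(f"x = {x:5.2f}, y = {y:4.2f}:  integral = {lhs.real:.6f}  "
          f"kernel = {poisson_kernel(x, y):.6f}")
```

The grid is arranged so that $\xi = 0$, where $|\xi|$ has a corner, falls on a cell boundary; the imaginary parts cancel by symmetry, as they must since the right-hand side of (12) is real.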

Next we establish (10) when $f$ belongs to (say) the space $\mathcal{S}$. Indeed, for
fixed $(x,y) \in \mathbb{R}^2_+$ consider the function $\Phi(t,\xi) = f(t)e^{-2\pi i t\xi}e^{-2\pi|\xi|y}e^{2\pi i x\xi}$
on $\mathbb{R}^2 = \{(\xi,t)\}$. Since $|\Phi(t,\xi)| = |f(t)|e^{-2\pi|\xi|y}$, then (because $f$ is rapidly
decreasing) $\Phi$ is integrable over $\mathbb{R}^2$. Applying Fubini’s theorem yields

\[ \int_{\mathbb{R}}\left(\int_{\mathbb{R}} \Phi(t,\xi)\,d\xi\right)dt = \int_{\mathbb{R}}\left(\int_{\mathbb{R}} \Phi(t,\xi)\,dt\right)d\xi. \]

The right-hand side obviously gives $\int_{\mathbb{R}} \hat{f}(\xi)e^{-2\pi|\xi|y}e^{2\pi i x\xi}\,d\xi$, while the
left-hand side yields $\int_{\mathbb{R}} f(t)\mathcal{P}_y(x-t)\,dt$ in view of (12) above. However,
if we use the relation (6) in Chapter 2 we see that

\[ \int_{\mathbb{R}} f(t)\mathcal{P}_y(x-t)\,dt = \int_{\mathbb{R}} f(x-t)\mathcal{P}_y(t)\,dt. \]

Thus the Poisson integral representation (10) holds for every $f \in \mathcal{S}$. For
a general $f \in L^2(\mathbb{R})$ we consider a sequence $\{f_n\}$ of elements in $\mathcal{S}$, so
that $f_n \to f$ (and also $\hat{f}_n \to \hat{f}$) in the $L^2$-norm. A passage to the limit
then yields the formula for $f$ from the corresponding formula for each
$f_n$. Indeed, by the Cauchy-Schwarz inequality we have




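\[ \left|\int_{\mathbb{R}} [\hat{f}(\xi) - \hat{f}_n(\xi)]e^{-2\pi|\xi|y}e^{2\pi i x\xi}\,d\xi\right| \le \|\hat{f} - \hat{f}_n\|_{L^2}\left(\int_{\mathbb{R}} e^{-4\pi|\xi|y}\,d\xi\right)^{1/2}, \]

and also

\[ \left|\int_{\mathbb{R}} [f(x-t) - f_n(x-t)]\mathcal{P}_y(t)\,dt\right| \le \|f - f_n\|_{L^2}\left(\int_{\mathbb{R}} |\mathcal{P}_y(t)|^2\,dt\right)^{1/2}, \]

and the right-hand sides tend to 0 because for each fixed $(x,y) \in \mathbb{R}^2_+$ the
functions $e^{-2\pi|\xi|y}$, $\xi \in \mathbb{R}$, and $\mathcal{P}_y(t)$, $t \in \mathbb{R}$, belong to $L^2(\mathbb{R})$.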

Having established the Poisson integral representation (10), we return
to our given element $F \in H^2(\mathbb{R}^2_+)$. We know that there is an $L^2$ function
$\hat{F}_0(\xi)$ (which vanishes when $\xi < 0$) such that (6) holds. With $F_0$ the
$L^2(\mathbb{R})$ function whose Fourier transform is $\hat{F}_0(\xi)$, we see from (10), with
$f = F_0$, that

\[ F(x+iy) = \int_{\mathbb{R}} F_0(x-t)\mathcal{P}_y(t)\,dt. \]

From this we deduce the fact that $F(x+iy) \to F_0(x)$ a.e. in $x$ as $y \to 0$,
since the family $\{\mathcal{P}_y\}$ is an approximation of the identity for which
Theorem 2.1 in Chapter 3 applies. There is, however, one small obstacle that
has to be overcome: the theorem as stated applied to $L^1$ functions and
not to functions in $L^2$. Nevertheless, given the nature of the approximation
to the identity, a simple “localization” argument will succeed. We
proceed as follows.

It will suffice to see that for any large $N$, which is fixed, $F(x+iy) \to
F_0(x)$ for a.e. $x$ with $|x| < N$. To do this, decompose $F_0$ as $G + H$, where
$G(x) = F_0(x)$ when $|x| \le 2N$, $G(x) = 0$ when $|x| > 2N$; thus $H(x) = 0$
if $|x| \le 2N$ but $|H(x)| \le |F_0(x)|$. Note that now $G \in L^1$ and

\[ \int_{\mathbb{R}} F_0(x-t)\mathcal{P}_y(t)\,dt = \int_{\mathbb{R}} G(x-t)\mathcal{P}_y(t)\,dt + \int_{\mathbb{R}} H(x-t)\mathcal{P}_y(t)\,dt. \]

Therefore, by the above mentioned theorem in Chapter 3, the first
integral on the right-hand side converges for a.e. $x$ to $G(x) = F_0(x)$ when
$|x| < N$. Moreover, when $|x| < N$ the integrand of the second integral
vanishes when $|t| < N$ (since then $|x - t| < 2N$). That integral is therefore
majorized by

\[ \left(\int_{\mathbb{R}} |H(x-t)|^2\,dt\right)^{1/2}\left(\int_{|t|\ge N} |\mathcal{P}_y(t)|^2\,dt\right)^{1/2}. \]

However $\left(\int_{\mathbb{R}} |H(x-t)|^2\,dt\right)^{1/2} \le \|F_0\|_{L^2}$, while (as is easily seen)
$\int_{|t|\ge N} |\mathcal{P}_y(t)|^2\,dt \to 0$ as $y \to 0$. Hence $F(x+iy) \to F_0(x)$ for a.e. $x$ with
$|x| < N$, as $y \to 0$, and since $N$ is arbitrary, the proof of Theorem 2.3 is
now complete.

The following comments may help clarify the thrust of the above the- 
orems. 

(i) Let $S$ be the subspace of $L^2(\mathbb{R})$ consisting of all functions $F_0$ arising in
Theorem 2.3. Then, since the functions $F_0$ are exactly those functions in
$L^2$ whose Fourier transform is supported on the half-line $(0,\infty)$, we see
that $S$ is a closed subspace. We might be tempted to say that $S$ consists
of those functions in $L^2$ that arise as boundary values of holomorphic
functions in the upper half-plane; but this heuristic assertion is not exact
if we do not add a quantitative restriction such as in the definition (5)
of the Hardy space. See Exercise 4.

(ii) Suppose we defined $P$ to be the orthogonal projection on the subspace
$S$ of $L^2$. Then, as is easily seen, $\widehat{(Pf)}(\xi) = \chi(\xi)\hat{f}(\xi)$ for any $f \in L^2(\mathbb{R})$;
here $\chi$ is the characteristic function of $(0,\infty)$. The operator $P$ is also
closely related to the Cauchy integral. Indeed, if $F$ is the (unique)
element in $H^2(\mathbb{R}^2_+)$ whose boundary function (according to Theorem 2.3)
is $P(f)$, then

\[ F(z) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{f(t)}{t - z}\,dt, \quad z \in \mathbb{R}^2_+. \]

To prove this it suffices to verify that for any $f \in L^2(\mathbb{R})$ and any fixed
$z = x + iy \in \mathbb{R}^2_+$, we have

\[ \int_0^\infty \hat{f}(\xi)e^{2\pi i z\xi}\,d\xi = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{f(t)}{t - z}\,dt, \quad z = x + iy \in \mathbb{R}^2_+. \]

This is proved in the same way as the Poisson integral representation (10)
except here we use the identity (11) instead of (12). The details may be
left to the interested reader. Also, the reader might note the close analogy
between this version of the Cauchy integral for the upper half-plane, and
a corresponding version for the unit disc, as given in Example 2, Section 4
of Chapter 4.

(iii) In analogy with the periodic case discussed in Exercise 30 of
Chapter 4, we define a Fourier multiplier operator $T$ on $\mathbb{R}$ to be a linear
operator on $L^2(\mathbb{R})$ determined by a bounded function $m$ (the multiplier),
such that $T$ is defined by the formula $\widehat{(Tf)}(\xi) = m(\xi)\hat{f}(\xi)$ for
any $f \in L^2(\mathbb{R})$. The orthogonal projection $P$ above is such an operator
and its multiplier is the characteristic function $\chi(\xi)$. Another closely
related operator of this type is the Hilbert transform $H$ defined by
$P = \frac{I + iH}{2}$. Then $H$ is a Fourier multiplier operator corresponding to the
multiplier $\frac{1}{i}\,\mathrm{sign}(\xi)$. Among the many important properties of $H$ is its
connection to conjugate harmonic functions. Indeed, for $f$ a real-valued
function in $L^2(\mathbb{R})$, $f$ and $H(f)$ are, respectively, the real and imaginary
parts of the boundary values of a function in the Hardy space. More
about the Hilbert transform can be found in Exercises 9 and 10 and
Problem 5 below.


3 Constant coefficient partial differential equations 

We turn our attention to solving the linear partial differential equation

\[ L(u) = f, \tag{13} \]

where the operator $L$ takes the form

\[ L = \sum_{|\alpha|\le n} a_\alpha\left(\frac{\partial}{\partial x}\right)^\alpha \]

with $a_\alpha \in \mathbb{C}$ constants.

In the study of the classical examples of $L$, such as the wave equation,
the heat equation, and Laplace’s equation, one already sees the Fourier
transform entering in an important way.⁴ For general $L$, this key role
is further indicated by the following simple observation. If, for example,
we try to solve this equation with both $u$ and $f$ elements in $\mathcal{S}$, then this
is equivalent to the algebraic equation

\[ P(\xi)\hat{u}(\xi) = \hat{f}(\xi), \]

where $P(\xi)$ is the characteristic polynomial of $L$ defined by

\[ P(\xi) = \sum_{|\alpha|\le n} a_\alpha(2\pi i\xi)^\alpha. \]

This is because one has the Fourier transform identity

\[ \widehat{\left(\frac{\partial^\alpha f}{\partial x^\alpha}\right)}(\xi) = (2\pi i\xi)^\alpha\hat{f}(\xi). \]

Thus a solution $u$ in the space $\mathcal{S}$ (if it exists) would be uniquely
determined by

\[ \hat{u}(\xi) = \frac{\hat{f}(\xi)}{P(\xi)}. \]

⁴See for example Chapters 5 and 6 in Book I.
In a more general setting, matters are not so easy: aside from the
question of defining (13), the Fourier transform is not directly applicable;
also, solutions that we prove to exist (but are not unique!) have to be
understood in a wider sense.
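
In the favorable situation where $u$ and $f$ both lie in $\mathcal{S}$, the reduction of $L(u) = f$ to $P(\xi)\hat{u}(\xi) = \hat{f}(\xi)$ can be checked on a simple example. The sketch below is an illustration under assumptions of our own: $d = 1$, the operator $L = 1 - d^2/dx^2$ (so that $P(\xi) = 1 + 4\pi^2\xi^2$ never vanishes), the function $u(x) = e^{-\pi x^2}$, and midpoint quadrature for the Fourier transform. Since $\hat{u}(\xi) = e^{-\pi\xi^2}$ for this $u$, the quotient $\hat{f}/P$ should reproduce that Gaussian.

```python
import cmath
import math

def u(x):
    """A Schwartz function chosen for illustration: u(x) = exp(-pi x^2)."""
    return math.exp(-math.pi * x * x)

def f(x):
    """f = L(u) = u - u'' computed by hand: (1 + 2 pi - 4 pi^2 x^2) exp(-pi x^2)."""
    return (1.0 + 2.0 * math.pi - 4.0 * math.pi ** 2 * x * x) * u(x)

def symbol(xi):
    """Characteristic polynomial of L = 1 - d^2/dx^2: P(xi) = 1 - (2 pi i xi)^2."""
    return 1.0 - (2j * math.pi * xi) ** 2

def fourier_transform(func, xi, x_max=6.0, h=0.01):
    """Midpoint-rule approximation of the integral (1) in dimension d = 1."""
    n = int(round(2.0 * x_max / h))
    total = 0j
    for k in range(n):
        x = -x_max + (k + 0.5) * h
        total += func(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h

for xi in (0.0, 0.5, 1.0, 2.0):
    u_hat = fourier_transform(f, xi) / symbol(xi)
    # For this Gaussian, u_hat(xi) = exp(-pi xi^2) = u(xi).
    print(f"xi = {xi:3.1f}:  f_hat/P = {u_hat.real:.8f}   exp(-pi xi^2) = {u(xi):.8f}")
```

The example was chosen so that $P$ has no real zeros. For a general $L$ the polynomial $P$ may vanish, which is one reason the division by $P(\xi)$ cannot be carried out so directly.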

3.1 Weak solutions 

As the reader may have guessed, it will not suffice to restrict our attention
to those functions for which $L(u)$ is defined in the usual way, but instead
a broader notion is needed, one involving the idea of “weak solutions.”
To describe this concept, we start with a given open set $\Omega$ in
$\mathbb{R}^d$ and consider the space $C_0^\infty(\Omega)$, which consists of
the indefinitely differentiable functions⁵ having compact support in
$\Omega$.⁶ We have the following fact.

Lemma 3.1 The space $C_0^\infty(\Omega)$ is dense in $L^2(\Omega)$ in the norm
$\|\cdot\|_{L^2(\Omega)}$.

The proof is essentially a repetition of that of Lemma 1.2. We take the
precaution of modifying the definition of $g_M$ given there to be:
$g_M(x) = f(x)$ if $|x| \le M$, $d(x, \Omega^c) \ge 1/M$ and $|f(x)| \le M$,
and $g_M(x) = 0$ otherwise. Also, when we regularize $g_M$, we replace it with
$g_M * \varphi_\delta$, with $\delta \le 1/2M$. Then the support of
$g_M * \varphi_\delta$ is still compact and at a distance $\ge 1/2M$ from
$\Omega^c$.

⁵ Indefinitely differentiable functions are also referred to as $C^\infty$
functions, or smooth functions.

⁶ This means that the closure of the support of $f$, as defined in Section 1
of Chapter 2, is compact and contained in $\Omega$.

We next consider the adjoint operator of $L$ defined by

\[ L^* = \sum_{|\alpha| \le n} (-1)^{|\alpha|}\, \overline{a_\alpha} \left( \frac{\partial}{\partial x} \right)^{\alpha}. \]

The operator $L^*$ is called the adjoint of $L$ because, in analogy with
the definition of the adjoint of a bounded linear transformation given in
Section 5.2 of the previous chapter, we have

\[ (L\varphi, \psi) = (\varphi, L^*\psi) \quad \text{whenever } \varphi, \psi \in C_0^\infty(\Omega), \tag{14} \]

where $(\cdot,\cdot)$ denotes the inner product on $L^2(\Omega)$ (which is the
restriction of the usual inner product on $L^2(\mathbb{R}^d)$). The identity
(14) is proved by successive integration by parts. Indeed, consider first the
special case when $L = \partial/\partial x_j$, and then
$L^* = -\partial/\partial x_j$. If we use Fubini’s theorem, integrating first
in the $x_j$ variable, then in this case (14) reduces to the familiar
one-dimensional formula

\[ \int_{-\infty}^{\infty} \frac{d\varphi}{dx}\, \overline{\psi}\, dx = - \int_{-\infty}^{\infty} \varphi\, \overline{\frac{d\psi}{dx}}\, dx, \]

with the integrated boundary terms vanishing because of the assumed
support properties of $\psi$ (or $\varphi$). Once established for
$L = \partial/\partial x_j$, $1 \le j \le d$, then (14) follows for
$L = (\partial/\partial x)^{\alpha}$ by iteration, and hence for general
$L$ by linearity.

At this point we digress momentarily to consider besides $C_0^\infty(\Omega)$
some other spaces of differentiable functions on $\Omega$ that will be useful
later. The space $C^n(\Omega)$ consists of all functions $f$ on $\Omega$ that
have continuous partial derivatives of order $\le n$. Also, the space
$C^n(\overline{\Omega})$ consists of those functions on $\overline{\Omega}$
that can be extended to functions in $\mathbb{R}^d$ that belong to
$C^n(\mathbb{R}^d)$. Thus, in an obvious sense, we have the inclusion relation

\[ C_0^\infty(\Omega) \subset C^n(\overline{\Omega}) \subset C^n(\Omega), \quad \text{for each positive integer } n. \]

Returning to our partial differential operator $L$, it is useful to observe
that the formula

\[ (Lu, \psi) = (u, L^*\psi) \]

continues to hold (with the same proof) if we merely assume that
$u \in C^n(\Omega)$ without assuming it has compact support, while still
supposing $\psi \in C_0^\infty(\Omega)$.

In particular, if we have $L(u) = f$ in the ordinary sense (sometimes
called the “strong” sense), which requires the assumption that
$u \in C^n(\Omega)$ in order to define the partial derivatives entering in
$Lu$, then we would also have

\[ (f, \psi) = (u, L^*\psi) \quad \text{for all } \psi \in C_0^\infty(\Omega). \tag{15} \]

This leads to the following important definition: if $f \in L^2(\Omega)$, a
function $u \in L^2(\Omega)$ is a weak solution of the equation $Lu = f$ in
$\Omega$ if (15) holds. Of course an ordinary solution is always a weak
solution.

Significant instances of weak solutions that are not ordinary solutions
already arise in elementary situations such as in the study of the
one-dimensional wave equation. Here
$L(u) = (\partial^2 u/\partial x^2) - (\partial^2 u/\partial t^2)$, so the
underlying space is $\mathbb{R}^2 = \{(x_1, x_2) : \text{with } x_1 = x,\ x_2 = t\}$.
Suppose, for example, we consider the case of the “plucked string.”⁷ We are
then looking at the solution of $L(u) = 0$ subject to the boundary conditions
$u(x, 0) = f(x)$ and $(\partial u/\partial t)(x, 0) = 0$ for
$0 \le x \le \pi$, where the graph of $f$ is piecewise linear and is
illustrated in Figure 2.

⁷ See Chapter 1 in Book I.

Figure 2. Initial position of a plucked string

If one extends $f$ to $[-\pi, \pi]$ by making it odd, and then to all of
$\mathbb{R}$ by periodicity (of period $2\pi$), then the solution is given by
d’Alembert’s formula

\[ u(x, t) = \frac{f(x + t) + f(x - t)}{2}. \]

In the present case $u$ is not twice continuously differentiable, and it is
therefore not an ordinary solution. Nevertheless it is a weak solution.
To see this, approximate $f$ by a sequence of functions $f_n$ that are
$C^\infty$ and such that $f_n \to f$ uniformly on every compact subset of
$\mathbb{R}$.⁸ If we define $u_n(x, t)$ as $[f_n(x + t) + f_n(x - t)]/2$, we
can check directly that $L(u_n) = 0$ and hence $(u_n, L^*\psi) = 0$ for all
$\psi \in C_0^\infty(\mathbb{R}^2)$, and thus by uniform convergence we obtain
that $(u, L^*\psi) = 0$ as desired.

A different example illustrating the nature of weak solutions arises for
the operator $L = d/dx$ on $\mathbb{R}$. If we suppose $\Omega = (0, 1)$, then
with $u$ and $f$ in $L^2(\Omega)$, we have that $Lu = f$ in the weak sense if
and only if there is an absolutely continuous function $F$ on $[0, 1]$ such
that $F(x) = u(x)$ and $F'(x) = f(x)$ almost everywhere. For more about this,
see Exercise 14.
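The defining identity (15) can be tested numerically for $L = d/dx$ on $\Omega = (0,1)$, where $L^* = -d/dx$. The sketch below is an illustration only: the choices $u(x) = |x - 1/2|$ (absolutely continuous but not differentiable at $1/2$), $f(x) = \operatorname{sign}(x - 1/2)$, the particular test function $\psi$, and the midpoint quadrature are assumptions made for the example.

```python
import math

def bump(x):
    """Smooth function supported in (0, 1): exp(-1/(x(1-x))) inside, 0 outside."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return math.exp(-1.0 / (x * (1.0 - x)))

def psi(x):
    # Hypothetical test function in C_0^infty((0, 1)); the factor (1 + x)
    # breaks the symmetry about 1/2 so that both sides are non-zero.
    return (1.0 + x) * bump(x)

def psi_prime(x):
    if x <= 0.0 or x >= 1.0:
        return 0.0
    q = x * (1.0 - x)
    bump_prime = bump(x) * (1.0 - 2.0 * x) / (q * q)
    return bump(x) + (1.0 + x) * bump_prime

def u(x):
    return abs(x - 0.5)

def f(x):
    return 1.0 if x > 0.5 else -1.0

def integrate(g, n=20000):
    """Midpoint rule on [0, 1]; n is even, so 1/2 is never a sample point."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))

def weak_identity_sides():
    """Returns ((f, psi), (u, L* psi)) with L* = -d/dx and real-valued functions."""
    lhs = integrate(lambda x: f(x) * psi(x))
    rhs = integrate(lambda x: -u(x) * psi_prime(x))
    return lhs, rhs

if __name__ == "__main__":
    print(weak_identity_sides())
```

The two printed numbers should agree up to quadrature error, even though $u$ has no classical derivative at $x = 1/2$.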

3.2 The main theorem and key estimate 

We now turn to the general theorem guaranteeing the existence of solutions
of partial differential equations with constant coefficients.

Theorem 3.2 Suppose $\Omega$ is a bounded open subset of $\mathbb{R}^d$. Given
a linear partial differential operator $L$ with constant coefficients, there
exists a bounded linear operator $K$ on $L^2(\Omega)$ such that whenever
$f \in L^2(\Omega)$, then

\[ L(Kf) = f \quad \text{in the weak sense.} \]

In other words, $u = K(f)$ is a weak solution to $L(u) = f$.

⁸ One may write, for example, $f_n = f * \varphi_{1/n}$, where
$\varphi_\varepsilon$ is the approximation to the identity, as in the proof
of Lemma 1.2.

The heart of the matter lies in an inequality that we state next, but 
whose proof (which uses the Fourier transform) is postponed until the 
next section. 

Lemma 3.3 There exists a constant $c$ such that

\[ \|\psi\|_{L^2(\Omega)} \le c\, \|L^*\psi\|_{L^2(\Omega)} \quad \text{whenever } \psi \in C_0^\infty(\Omega). \]

The usefulness of this lemma comes about for the following reason. 
If L is a finite-dimensional linear transformation, the solvability of L 
(the fact that it is surjective) is of course equivalent with the fact that 
its adjoint L* is injective. In effect, the lemma provides the analytic 
substitute for this reasoning in an infinite-dimensional setting. 

We first prove the theorem assuming the validity of the inequality in 
the lemma. 

Consider the pre-Hilbert space $\mathcal{H}_0 = C_0^\infty(\Omega)$ equipped
with the inner product and norm

\[ \langle \varphi, \psi \rangle = (L^*\varphi, L^*\psi), \qquad \|\psi\|_0 = \|L^*\psi\|_{L^2(\Omega)}. \]

Following the results in Section 2.3 of Chapter 4, we let $\mathcal{H}$ denote
the completion of $\mathcal{H}_0$. By Lemma 3.3, a Cauchy sequence in the
$\|\cdot\|_0$-norm is also Cauchy in the $L^2(\Omega)$-norm; hence we may
identify $\mathcal{H}$ with a subspace of $L^2(\Omega)$. Also, $L^*$,
initially defined as a bounded operator from $\mathcal{H}_0$ to
$L^2(\Omega)$, extends to a bounded operator $L^*$ from $\mathcal{H}$ to
$L^2(\Omega)$ (by Lemma 1.3). For a fixed $f \in L^2(\Omega)$, consider the
linear map $\ell_0 : C_0^\infty(\Omega) \to \mathbb{C}$ defined by

\[ \ell_0(\psi) = (\psi, f) \quad \text{for } \psi \in C_0^\infty(\Omega). \]

The Cauchy-Schwarz inequality together with another application of
Lemma 3.3 yields

\[ |\ell_0(\psi)| = |(\psi, f)| \le \|\psi\|_{L^2(\Omega)} \|f\|_{L^2(\Omega)} \le c\, \|L^*\psi\|_{L^2(\Omega)} \|f\|_{L^2(\Omega)} \le c' \|\psi\|_0, \]

with $c' = c\|f\|_{L^2(\Omega)}$. Hence $\ell_0$ is bounded on the pre-Hilbert
space $\mathcal{H}_0$. Therefore, $\ell_0$ extends to a bounded linear
functional $\ell$ on $\mathcal{H}$ (see Section 5.1, Chapter 4), and the above
inequalities show that $\|\ell\| \le c\|f\|_{L^2(\Omega)}$. By the Riesz
representation theorem applied to $\ell$ on the Hilbert space $\mathcal{H}$
(Theorem 5.3 in Chapter 4), there exists $U \in \mathcal{H}$ such that

\[ \ell(\psi) = \langle \psi, U \rangle = (L^*\psi, L^*U) \quad \text{for all } \psi \in C_0^\infty(\Omega). \]

Here $\langle \cdot, \cdot \rangle$ denotes the extension to $\mathcal{H}$ of
the initial inner product on $\mathcal{H}_0$, and $L^*$ also denotes the
extension of $L^*$ originally given on $\mathcal{H}_0$.

If we let $u = L^*U$, then $u \in L^2(\Omega)$, and we find that

\[ \ell(\psi) = (\psi, f) = (L^*\psi, u) \quad \text{for all } \psi \in C_0^\infty(\Omega). \]

Hence

\[ (f, \psi) = (u, L^*\psi) \quad \text{for all } \psi \in C_0^\infty(\Omega), \]

and by definition, $u$ is a weak solution to the equation $Lu = f$ in
$\Omega$. If we let $Kf = u$, we see that once $f$ is given, $Kf$ is uniquely
determined by the above steps. Since
$\|U\|_0 = \|\ell\| \le c\|f\|_{L^2(\Omega)}$ we see that

\[ \|Kf\|_{L^2(\Omega)} = \|u\|_{L^2(\Omega)} = \|L^*U\|_{L^2(\Omega)} = \|U\|_0 \le c\|f\|_{L^2(\Omega)}, \]

whence $K : L^2(\Omega) \to L^2(\Omega)$ is bounded.

Proof of the main estimate 

To complete the proof of the theorem, we must still prove the estimate
in Lemma 3.3, that is,

\[ \|\psi\|_{L^2(\Omega)} \le c\, \|L^*\psi\|_{L^2(\Omega)} \quad \text{whenever } \psi \in C_0^\infty(\Omega). \]

The reasoning below relies on an important fact: if $f$ has compact
support in $\mathbb{R}$, then $\hat{f}(\xi)$ initially defined for
$\xi \in \mathbb{R}$ extends to an entire function for
$\zeta = \xi + i\eta \in \mathbb{C}$. This observation reduces the problem to
an inequality about holomorphic functions and polynomials.

Lemma 3.4 Suppose $P(z) = z^m + \cdots + a_1 z + a_0$ is a polynomial of
degree $m$ with leading coefficient 1. If $F$ is a holomorphic function on
$\mathbb{C}$, then

\[ |F(0)|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |P(e^{i\theta}) F(e^{i\theta})|^2 \, d\theta. \]

Proof. The lemma is a consequence of the special case when $P = 1$

\[ |F(0)|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |F(e^{i\theta})|^2 \, d\theta. \tag{16} \]

This assertion follows directly from the mean-value identity (8) in
Section 2 with $\zeta = 0$, $r = 1$, via the Cauchy-Schwarz inequality. With
it we begin by factoring $P$:

\[ P(z) = \prod_{|\alpha| \ge 1} (z - \alpha) \prod_{|\beta| < 1} (z - \beta) = P_1(z) P_2(z), \]

where each product is finite and taken over the roots of $P$ whose absolute
values are $\ge 1$ and $< 1$, respectively.

Note that $|P_1(0)| = \prod_{|\alpha| \ge 1} |\alpha| \ge 1$.

For $P_2$ we write

\[ (z - \beta) = -(1 - \overline{\beta} z)\, \psi_\beta(z), \]

where $\psi_\beta(z) = (\beta - z)/(1 - \overline{\beta} z)$ are “Blaschke
factors” that have the obvious property that they are holomorphic in a
region containing the closed unit disc and $|\psi_\beta(e^{i\theta})| = 1$;
see also Chapter 8 in Book II. We write
$\tilde{P}_2 = \prod_{|\beta| < 1} (1 - \overline{\beta} z)$ and
$\tilde{P} = P_1 \tilde{P}_2$. Thus $|\tilde{P}(0)| \ge 1$, while
$|\tilde{P}(e^{i\theta})| = |P(e^{i\theta})|$ for every $\theta$. We now apply
(16) to the function $\tilde{P} F$ in place of $F$ and find that

\[ |F(0)|^2 \le |\tilde{P}(0) F(0)|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |\tilde{P}(e^{i\theta}) F(e^{i\theta})|^2 \, d\theta = \frac{1}{2\pi} \int_0^{2\pi} |P(e^{i\theta}) F(e^{i\theta})|^2 \, d\theta, \]

which gives the desired conclusion.
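As a numerical illustration of Lemma 3.4, one can evaluate the circle average for a specific pair. The polynomial $P(z) = (z-2)(z-\tfrac12)$, the entire function $F(z) = e^z$, and the equally spaced quadrature below are assumptions chosen for this sketch; they are not taken from the text.

```python
import cmath
import math

def circle_mean_square(P, F, n=4096):
    """Approximates (1/2pi) * integral over [0, 2pi] of |P(e^{it}) F(e^{it})|^2 dt
    using n equally spaced points on the unit circle."""
    total = 0.0
    for k in range(n):
        z = cmath.exp(2j * math.pi * k / n)
        total += abs(P(z) * F(z)) ** 2
    return total / n

def P(z):
    # Monic polynomial with one root outside and one root inside the unit disc.
    return (z - 2.0) * (z - 0.5)

def F(z):
    # An entire function.
    return cmath.exp(z)

lhs = abs(F(0)) ** 2               # |F(0)|^2 = 1
rhs = circle_mean_square(P, F)     # right-hand side of the lemma

if __name__ == "__main__":
    print(lhs, rhs)
```

For this pair the modified polynomial of the proof is $\tilde{P}(z) = (z-2)(1 - \tfrac12 z)$, so $|\tilde{P}(0)F(0)|^2 = 4$, and the computed average should be at least 4, which is stronger than the bound $|F(0)|^2 = 1$ stated in the lemma.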

We turn to the proof of the inequality $\|\psi\| \le c\|L^*\psi\|$ for all
$\psi \in C_0^\infty(\Omega)$ in the special case of one dimension, that is,
$\Omega \subset \mathbb{R}$.

Suppose $f$ is an $L^2$ function supported on the interval $[-M, M]$. Then

\[ \hat{f}(\xi) = \int_{-M}^{M} f(x) e^{-2\pi i x \xi} \, dx \]

whenever $\xi \in \mathbb{R}$. In fact, the above integral converges whenever
$\xi$ is replaced by $\zeta = \xi + i\eta \in \mathbb{C}$, and we may extend
$\hat{f}$ to a holomorphic function of $\zeta$ in the whole complex plane. An
application of the Plancherel formula (for fixed $\eta$) yields

\[ \int_{-\infty}^{\infty} |\hat{f}(\xi + i\eta)|^2 \, d\xi \le e^{4\pi M |\eta|} \int_{-\infty}^{\infty} |f(x)|^2 \, dx. \tag{17} \]

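We use this observation in the following context. We may assume (upon
multiplying $L$ by a suitable constant) that

\[ L^* = \sum_{0 \le k \le n} (-1)^k\, \overline{a_k} \left( \frac{d}{dx} \right)^k, \]

where $a_n = (2\pi i)^{-n}$. If we let
$Q(\xi) = \sum_{0 \le k \le n} (-1)^k\, \overline{a_k}\, (2\pi i \xi)^k$ be its
characteristic polynomial (which then has leading coefficient 1), we note that

\[ Q(\xi)\, \hat{\psi}(\xi) = (L^*\psi)^{\wedge}(\xi) \]

whenever $\psi \in C_0^\infty(\mathbb{R})$.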

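If $M$ is chosen so large that $\Omega \subset [-M, M]$, then for
$\psi \in C_0^\infty(\Omega)$ our previous observation gives

\[ \int_{-\infty}^{\infty} |Q(\xi + i\eta)\, \hat{\psi}(\xi + i\eta)|^2 \, d\xi \le e^{4\pi M |\eta|} \int_{-\infty}^{\infty} |L^*\psi(x)|^2 \, dx. \]

Picking $\eta = \sin\theta$, and making a translation by $\cos\theta$ yields

\[ \int_{-\infty}^{\infty} |Q(\xi + \cos\theta + i\sin\theta)\, \hat{\psi}(\xi + \cos\theta + i\sin\theta)|^2 \, d\xi \le e^{4\pi M} \int_{-\infty}^{\infty} |L^*\psi(x)|^2 \, dx. \]

An application of Lemma 3.4 with $F(z) = \hat{\psi}(\xi + z)$ and
$Q(\xi + z)$ in place of $P(z)$ then gives

\[ |\hat{\psi}(\xi)|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |Q(\xi + \cos\theta + i\sin\theta)\, \hat{\psi}(\xi + \cos\theta + i\sin\theta)|^2 \, d\theta. \]

We now integrate in $\xi$ over $\mathbb{R}$, and on the right-hand side
interchange the order of the $\xi$ and $\theta$ integrations; also by
translation invariance we replace the integration in the $\xi$ variable by
that in the variable $\xi + \cos\theta$. Using (17) the result is

\[ \|\hat{\psi}\|^2_{L^2(\mathbb{R})} \le \frac{1}{2\pi} \int_0^{2\pi} \int_{\mathbb{R}} |Q(\xi + i\sin\theta)\, \hat{\psi}(\xi + i\sin\theta)|^2 \, d\xi \, d\theta \le e^{4\pi M} \int_{\mathbb{R}} |L^*\psi(x)|^2 \, dx, \]

which by Plancherel’s identity proves the main lemma in the one-dimensional
case.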

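The higher dimensional case is a modification of the argument above.
Let $Q(\xi) = \sum_{|\alpha| \le n} (-1)^{|\alpha|}\, \overline{a_\alpha}\, (2\pi i \xi)^{\alpha}$
be the characteristic polynomial of $L^*$. Then we can choose a new set of
orthogonal axes (whose coordinates we denote by $(\xi_1, \ldots, \xi_d)$) so
that if $\xi = (\xi_1, \xi')$ with $\xi' = (\xi_2, \ldots, \xi_d)$, then
after multiplying by a suitable constant

\[ Q(\xi) = Q(\xi_1, \xi') = \xi_1^n + \sum_{j=0}^{n-1} \xi_1^j\, q_j(\xi'), \tag{18} \]

where $q_j(\xi')$ are polynomials of $\xi'$ (of degrees $\le n - j$).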

To see that such a choice is possible, write $Q = Q_n + Q'$, where $Q_n$ is
homogeneous of degree $n$ and $Q'$ has degree $< n$. Then since we may
assume $Q_n \ne 0$ there is (after multiplying $Q$ by a suitable constant)
a unit vector $\gamma$ so that $Q_n(\gamma) = 1$. Then $Q_n(\xi) = r^n$ if
$\xi = \gamma r$, $r \in \mathbb{R}$. We can then take the $\xi_1$-axis to lie
along $\gamma$, and the $\xi_2, \ldots, \xi_d$-axes to be in mutually
orthogonal directions, from which the form (18) is clear.

Proceeding now as before we obtain

\[ |\hat{\psi}(\xi_1, \xi')|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |Q(\xi_1 + e^{i\theta}, \xi')\, \hat{\psi}(\xi_1 + e^{i\theta}, \xi')|^2 \, d\theta \]

for each $(\xi_1, \xi') \in \mathbb{R}^d$. An integration⁹ then gives

\[ \|\psi\|^2_{L^2(\mathbb{R}^d)} \le \frac{1}{2\pi} \int_0^{2\pi} \int_{\mathbb{R}^d} |Q(\xi_1 + i\sin\theta, \xi')\, \hat{\psi}(\xi_1 + i\sin\theta, \xi')|^2 \, d\xi \, d\theta. \]

If we suppose that the projection of the (bounded) set $\Omega$ on the
$x_1$-axis is contained in $[-M, M]$, we see as before that the right-hand
side above is majorized by
$e^{4\pi M} \int_{\mathbb{R}^d} |L^*\psi(x)|^2 \, dx$, finishing the proof of
Lemma 3.3 and hence that of the theorem.

⁹ We note that by the rotational invariance of Lebesgue measure (Problem 4 in
Chapter 2 and Exercise 26 in Chapter 3), integration in $\xi$ can be carried
out in the new coordinates as well.

4* The Dirichlet principle

Dirichlet’s principle arose in the study of the boundary-value problem
for Laplace’s equation. Stated in the case of two dimensions it refers to
the classical problem of finding the steady-state temperature of a plate
whose boundary is exposed to a given temperature distribution. The
issue raised is the following question, called the Dirichlet problem:
If $\Omega$ is a bounded open set in $\mathbb{R}^2$ and $f$ a continuous
function on the boundary $\partial\Omega$, we wish to find a function
$u(x_1, x_2)$ such that

\[ \begin{cases} \Delta u = 0 & \text{in } \Omega, \\ u = f & \text{on } \partial\Omega. \end{cases} \tag{19} \]

Thus we need to determine a function that is $C^2$ (twice continuously
differentiable) in $\Omega$, whose Laplacian¹⁰ is zero, and which is
continuous on the closure of $\Omega$, with $u|_{\partial\Omega} = f$.

With either $\Omega$ or $f$ satisfying special symmetry conditions, the
solution to this problem can sometimes be written out explicitly. For
instance, if $\Omega$ is the unit disc, then

\[ u(r, \theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\varphi)\, P_r(\theta - \varphi) \, d\varphi, \]

where $P_r$ is the Poisson kernel (for the disc). We also obtained (in
Books I and II) explicit formulas for the solution of the Dirichlet problem
for some unbounded domains. For example, when $\Omega$ is the upper
half-plane the solution is

\[ u(x, y) = \int_{\mathbb{R}} \mathcal{P}_y(x - t)\, f(t) \, dt, \]

where $\mathcal{P}_y(x)$ is the analogous Poisson kernel for the upper
half-plane. A somewhat similar convolution formula was obtained when
$\Omega$ is a strip. Also, the Dirichlet problem can be solved explicitly
for certain $\Omega$ by using conformal mappings.¹¹

¹⁰ The Laplacian of a function $u$ in $\mathbb{R}^d$ is defined by
$\Delta u = \sum_{k=1}^{d} \partial^2 u/\partial x_k^2$.

¹¹ The close relation between conformal maps and the Dirichlet problem is
discussed in the last part of Section 1 of Chapter 8, in Book II.

In general, however, there are no explicit solutions, and other methods
must be found. An idea that was used initially was based on an approach
of wide utility in mathematics and physics: to find the equilibrium state
of a system one seeks to minimize an appropriate “energy” or “action.”
In the present case the role of this energy is played by the Dirichlet
integral, which is defined for appropriate functions $U$ by

\[ \mathcal{D}(U) = \int_\Omega |\nabla U|^2 = \int_\Omega \left( \left| \frac{\partial U}{\partial x_1} \right|^2 + \left| \frac{\partial U}{\partial x_2} \right|^2 \right) dx_1 \, dx_2. \]

(Note the similarity with the expression of the “potential energy” in the
case of the vibrating string in Chapters 3 and 6 of Book I.) In fact,
that approach underlies the proof Riemann proposed for his well-known
mapping theorem. About this early history R. Courant has written:

Already some years before the rise of Riemann’s genius,
C.F. Gauss and W. Thompson had observed that the boundary
value problem of the harmonic differential equation
$\Delta u = u_{xx} + u_{yy} = 0$ for a domain $G$ in the $x,y$-plane can be
reduced to the problem of minimizing the integral
$\iint (\phi_x^2 + \phi_y^2) \, dx \, dy$ for the domain $G$, under the
condition that the functions $\phi$ admitted to competition have the
prescribed boundary values. Because of the positive character of
$\iint (\phi_x^2 + \phi_y^2) \, dx \, dy$ the existence of a solution
for the latter problem was considered obvious and hence the
existence for the former assured. As a student in Dirichlet’s
lectures, Riemann had been fascinated by this convincing argument:
soon afterwards he used it, under the name “Dirichlet’s Principle,”
in a more varied and spectacular manner as the very foundation of
his new geometric function theory.

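The application of Dirichlet’s principle was thought to have been justified
by the following simple observation:

Proposition 4.1 Suppose there exists a function $u \in C^2(\overline{\Omega})$
that minimizes $\mathcal{D}(U)$ among all $U \in C^2(\overline{\Omega})$ with
$U|_{\partial\Omega} = f$. Then $u$ is harmonic in $\Omega$.

Proof. For functions $F$ and $G$ in $C^2(\overline{\Omega})$ define the
following inner-product

\[ \langle F, G \rangle = \int_\Omega \left( \frac{\partial F}{\partial x_1} \overline{\frac{\partial G}{\partial x_1}} + \frac{\partial F}{\partial x_2} \overline{\frac{\partial G}{\partial x_2}} \right) dx_1 \, dx_2. \]

We then note that $\mathcal{D}(u) = \langle u, u \rangle$. If $v$ is any
function in $C^2(\overline{\Omega})$ with $v|_{\partial\Omega} = 0$, then for
all $\varepsilon$ we have

\[ \mathcal{D}(u + \varepsilon v) \ge \mathcal{D}(u) \]

since $u + \varepsilon v$ and $u$ have the same boundary values, and $u$
minimizes the Dirichlet integral. We note, however, that

\[ \mathcal{D}(u + \varepsilon v) = \mathcal{D}(u) + \varepsilon^2 \mathcal{D}(v) + \varepsilon \langle u, v \rangle + \varepsilon \langle v, u \rangle. \]

Hence

\[ \varepsilon^2 \mathcal{D}(v) + \varepsilon \langle u, v \rangle + \varepsilon \langle v, u \rangle \ge 0, \]

and since $\varepsilon$ can be both positive or negative, this can happen
only if $\operatorname{Re} \langle u, v \rangle = 0$. Similarly, considering
the perturbation $u + i\varepsilon v$, we find
$\operatorname{Im} \langle u, v \rangle = 0$, and therefore
$\langle u, v \rangle = 0$. An integration by parts then provides

\[ 0 = \langle u, v \rangle = - \int_\Omega (\Delta u)\, \overline{v} \]

for all $v \in C^2(\overline{\Omega})$ with $v|_{\partial\Omega} = 0$. This
implies that $\Delta u = 0$ in $\Omega$, and of course $u$ equals $f$ on the
boundary.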


Nevertheless, several serious objections were later raised to Dirichlet’s
principle. The first was by Weierstrass, who pointed out that it
was not clear (and had not been proved) that a minimizing function for
the Dirichlet integral exists, so there might simply be no winner to the
implied competition in Proposition 4.1. He argued by analogy with a
simpler one-dimensional problem: that of minimizing the integral

\[ D(\varphi) = \int_{-1}^{1} |x \varphi'(x)|^2 \, dx \]

among all $C^1$ functions on $[-1, 1]$ that satisfy $\varphi(-1) = -1$ and
$\varphi(1) = 1$. The infimum of the values taken by this integral is zero.
To verify this, let $\psi$ be a smooth non-decreasing function on
$\mathbb{R}$ that satisfies $\psi(x) = 1$ for $x \ge 1$, and $\psi(x) = -1$ if
$x \le -1$. For each $0 < \varepsilon < 1$, we consider the function

\[ \varphi_\varepsilon(x) = \begin{cases} 1 & \text{if } \varepsilon \le x, \\ \psi(x/\varepsilon) & \text{if } -\varepsilon < x < \varepsilon, \\ -1 & \text{if } x \le -\varepsilon. \end{cases} \]

Then $\varphi_\varepsilon$ satisfies the desired constraints, and if $M$
denotes a bound for the derivative of $\psi$ we find

\[ D(\varphi_\varepsilon) = \int_{-\varepsilon}^{\varepsilon} |x|^2\, |\varepsilon^{-1} \psi'(x/\varepsilon)|^2 \, dx \le \int_{-\varepsilon}^{\varepsilon} |\psi'(x/\varepsilon)|^2 \, dx \le 2\varepsilon M^2. \]

In the limit as $\varepsilon$ tends to 0, we find that the infimum of the
integral $D(\varphi)$ is zero. This value cannot be attained by a $C^1$
function satisfying the boundary conditions, since $D(\varphi) = 0$ implies
$\varphi'(x) = 0$ and thus $\varphi$ is constant.
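Weierstrass’s example is easy to explore numerically. The sketch below assumes the particular profile $\psi(t) = (3t - t^3)/2$ on $[-1,1]$, extended by $\pm 1$ outside; this $\psi$ is only $C^1$ rather than smooth, which is enough here because the competition is among $C^1$ functions. The function names and the midpoint rule are likewise illustrative choices.

```python
def psi_prime(t):
    """Derivative of the assumed profile psi(t) = (3t - t^3)/2 on [-1, 1];
    psi is constant (equal to +1 or -1) outside that interval."""
    return 1.5 * (1.0 - t * t) if abs(t) < 1.0 else 0.0

M = 1.5  # bound for |psi'|, attained at t = 0

def weierstrass_energy(eps, n=20000):
    """Midpoint-rule value of D(phi_eps), the integral of |x phi_eps'(x)|^2.
    Only [-eps, eps] contributes, since phi_eps is constant elsewhere."""
    h = 2.0 * eps / n
    total = 0.0
    for k in range(n):
        x = -eps + (k + 0.5) * h
        total += (x * psi_prime(x / eps) / eps) ** 2
    return total * h

if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01):
        print(eps, weierstrass_energy(eps), 2.0 * eps * M * M)
```

For this profile a direct computation gives $D(\varphi_\varepsilon) = 12\varepsilon/35$, well below the bound $2\varepsilon M^2$, and the energies tend to zero although no admissible function has energy zero.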



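A further objection was raised by Hadamard, who remarked that
$\mathcal{D}(u)$ may be infinite even for a solution $u$ of the boundary
value problem: thus, in effect, there may simply be no competitors who
qualify for the competition!

To illustrate this point, we return to the disc, and consider the function

\[ f(\theta) = f_\alpha(\theta) = \sum_{n=0}^{\infty} 2^{-n\alpha} e^{i 2^n \theta} \]

for $\alpha > 0$. This function first appeared in Chapter 4 of Book I, where
it is shown that $f_\alpha$ is continuous but nowhere differentiable if
$\alpha \le 1$. The solution of the Dirichlet problem on the unit disc with
boundary value $f_\alpha$ is given by the Poisson integral

\[ u(r, \theta) = \sum_{n=0}^{\infty} r^{2^n} 2^{-n\alpha} e^{i 2^n \theta}. \]

However, the use of polar coordinates gives

\[ \left| \frac{\partial u}{\partial x_1} \right|^2 + \left| \frac{\partial u}{\partial x_2} \right|^2 = \left| \frac{\partial u}{\partial r} \right|^2 + \frac{1}{r^2} \left| \frac{\partial u}{\partial \theta} \right|^2. \]

Thus

\[ \int_{D_\rho} \left( \left| \frac{\partial u}{\partial x_1} \right|^2 + \left| \frac{\partial u}{\partial x_2} \right|^2 \right) dx_1 \, dx_2 = \int_0^{\rho} \int_0^{2\pi} \left( \left| \frac{\partial u}{\partial r} \right|^2 + \frac{1}{r^2} \left| \frac{\partial u}{\partial \theta} \right|^2 \right) d\theta \, r \, dr, \]

where $D_\rho$ is the disc of radius $0 < \rho < 1$ centered at the origin.
Since

\[ \frac{\partial u}{\partial r} \sim \sum_{n=0}^{\infty} 2^n 2^{-n\alpha} r^{2^n - 1} e^{i 2^n \theta} \quad \text{and} \quad \frac{\partial u}{\partial \theta} \sim \sum_{n=0}^{\infty} r^{2^n} 2^{-n\alpha}\, i 2^n e^{i 2^n \theta}, \]

applications of Parseval’s identity lead to

\[ \int_{D_\rho} \left( \left| \frac{\partial u}{\partial x_1} \right|^2 + \left| \frac{\partial u}{\partial x_2} \right|^2 \right) dx_1 \, dx_2 = 2\pi \sum_{n=0}^{\infty} 2^{n(1 - 2\alpha)} \rho^{2^{n+1}}, \]

which tends to infinity as $\rho \to 1$ if $\alpha \le 1/2$.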
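The blow-up in Hadamard’s example can be observed numerically. The sketch below assumes the closed form $2\pi \sum_{n \ge 0} 2^{n(1-2\alpha)} \rho^{2^{n+1}}$ obtained from Parseval’s identity for the energy over $D_\rho$; the truncation rule and the function name are illustrative choices.

```python
import math

def dirichlet_energy_on_disc(alpha, rho, max_terms=200):
    """Evaluates 2*pi * sum over n >= 0 of 2^{n(1 - 2 alpha)} * rho^{2^{n+1}}
    for 0 < rho < 1, stopping once the terms are negligibly small."""
    log_rho = math.log(rho)
    total = 0.0
    for n in range(max_terms):
        exponent = (2.0 ** (n + 1)) * log_rho   # log of rho^{2^{n+1}}
        if exponent < -700.0:                   # later terms are smaller still
            break
        total += 2.0 ** (n * (1.0 - 2.0 * alpha)) * math.exp(exponent)
    return 2.0 * math.pi * total

if __name__ == "__main__":
    for rho in (0.9, 0.99, 0.999, 0.9999):
        print(rho,
              dirichlet_energy_on_disc(0.25, rho),   # alpha <= 1/2: grows
              dirichlet_energy_on_disc(1.0, rho))    # alpha > 1/2: bounded
```

With $\alpha = 1/4$ the values grow without bound as $\rho \to 1$, whereas with $\alpha = 1$ they stay below $4\pi$, since $\sum_n 2^{-n} = 2$.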

One can formulate this objection in a more precise way by appealing
to the result in Exercise 20.

Despite these significant difficulties, Dirichlet’s principle can indeed be
validated, if applied in the appropriate way. A key insight is that the
space of competing functions arising in the proof of the above proposition
is itself a pre-Hilbert space, with inner product
$\langle \cdot, \cdot \rangle$ given there. The desired solution lies in the
completion of this pre-Hilbert space, and this requires the $L^2$ theory for
its analysis. These ideas were clearly not available at the time Dirichlet’s
principle was first formulated and used.

In what follows we shall describe how these additional concepts can 
be exploited. We will begin our presentation in the more general d- 
dimensional setting, but conclude with the application of these tech- 
niques to the solution of the two-dimensional problem (19). As an impor- 
tant preliminary matter we start with the study of some basic properties 
of harmonic functions. 


4.1 Harmonic functions 


Throughout this section $\Omega$ will denote an open subset of
$\mathbb{R}^d$. A function $u$ is harmonic in $\Omega$ if it is twice
continuously differentiable¹² and $u$ solves

\[ \Delta u = \sum_{j=1}^{d} \frac{\partial^2 u}{\partial x_j^2} = 0. \]

We shall see that harmonic functions can be characterized by a number
of equivalent properties.¹³ Adapting the terminology used in Section 3,
we say that $u$ is weakly harmonic in $\Omega$ if

\[ (u, \Delta\psi) = 0 \quad \text{for every } \psi \in C_0^\infty(\Omega). \tag{20} \]

Note that the left-hand side of (20) is well-defined for any $u$ that is
integrable on compact subsets of $\Omega$. Thus, in particular, a weakly
harmonic function needs to be defined only almost everywhere. Clearly,
however, any harmonic function is weakly harmonic.

Another notion is the mean-value property generalizing the identity (9)
in Section 2 for holomorphic functions. A continuous function $u$
defined in $\Omega$ satisfies this property if

\[ u(x_0) = \frac{1}{m(B)} \int_B u(x) \, dx \tag{21} \]

for each ball $B$ whose center is $x_0$ and whose closure $\overline{B}$ is
contained in $\Omega$.

¹² In other words, $u$ is in $C^2(\Omega)$ in the notation of Section 3.1.

¹³ Note that in the case of one dimension, harmonic functions are linear and
so their theory is essentially trivial.

The following two theorems give alternative characterizations of harmonic
functions. Their proofs are closely intertwined.

Theorem 4.2 If $u$ is harmonic in $\Omega$, then $u$ satisfies the mean-value
property (21). Conversely, a continuous function satisfying the mean-value
property is harmonic.

Theorem 4.3 Any weakly harmonic function $u$ in $\Omega$ can be corrected
on a set of measure zero so that the resulting function is harmonic in
$\Omega$.

The above statement says that for a given weakly harmonic function $u$
there exists a harmonic function $\tilde{u}$, so that
$\tilde{u}(x) = u(x)$ for a.e. $x \in \Omega$. Notice that since $\tilde{u}$
is necessarily continuous it is uniquely determined by $u$.
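The mean-value property (21) can be checked numerically in the plane. This is a sketch under stated assumptions: the functions $e^{x}\cos y$ (harmonic) and $x^2 + y^2$ (not harmonic), the chosen center and radius, and the polar midpoint quadrature are all illustrative choices rather than anything prescribed by the text.

```python
import math

def ball_average(u, x0, y0, radius, n_r=200, n_theta=64):
    """Average of u over the disc of the given radius centred at (x0, y0),
    computed in polar coordinates with midpoint rules in r and theta."""
    dr = radius / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for k in range(n_theta):
            t = (k + 0.5) * dtheta
            total += u(x0 + r * math.cos(t), y0 + r * math.sin(t)) * r
    return total * dr * dtheta / (math.pi * radius ** 2)

def harmonic(x, y):
    # Real part of e^z, hence harmonic in the whole plane.
    return math.exp(x) * math.cos(y)

def not_harmonic(x, y):
    # Its Laplacian is identically 4.
    return x * x + y * y

if __name__ == "__main__":
    x0, y0, radius = 0.3, -0.2, 0.5
    print(harmonic(x0, y0), ball_average(harmonic, x0, y0, radius))
    print(not_harmonic(x0, y0), ball_average(not_harmonic, x0, y0, radius))
```

The first pair of numbers should agree, while for $x^2 + y^2$ the average exceeds the central value by $\text{radius}^2/2$, in line with Theorem 4.2.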

Before we prove the theorems, we deduce a noteworthy corollary. It is 
a version of the maximum principle. 

Corollary 4.4 Suppose $\Omega$ is a bounded open set, and let
$\partial\Omega = \overline{\Omega} - \Omega$ denote its boundary. Assume that
$u$ is continuous in $\overline{\Omega}$ and is harmonic in $\Omega$. Then

\[ \max_{x \in \overline{\Omega}} |u(x)| = \max_{x \in \partial\Omega} |u(x)|. \]

Proof. Since the sets $\overline{\Omega}$ and $\partial\Omega$ are compact
and $u$ is continuous, the two maxima above are clearly attained. We suppose
that $\max_{x \in \overline{\Omega}} |u(x)|$ is attained at an interior point
$x_0 \in \Omega$, for otherwise there is nothing to prove.

Now by the mean-value property,
$|u(x_0)| \le \frac{1}{m(B)} \int_B |u(x)| \, dx$. If for some point
$x' \in B$ we had $|u(x')| < |u(x_0)|$, then a similar inequality would hold
in a small neighborhood of $x'$, and since $|u(x)| \le |u(x_0)|$ throughout
$B$, the result would be that
$\frac{1}{m(B)} \int_B |u(x)| \, dx < |u(x_0)|$, which is a contradiction.
Hence $|u(x)| = |u(x_0)|$ for each $x \in B$. Now this is true for each ball
$B_r$ of radius $r$, centered at $x_0$, such that $B_r \subset \Omega$. Let
$r_0$ be the least upper bound of such $r$; then $\overline{B_{r_0}}$
intersects the boundary $\partial\Omega$ at some point $\tilde{x}$. Since
$|u(x)| = |u(x_0)|$ for all $x \in B_r$, $r < r_0$, it follows by continuity
that $|u(\tilde{x})| = |u(x_0)|$, proving the corollary.

Turning to the proofs of the theorems, we first establish a variant
of Green’s formula (for the unit ball) that does not explicitly involve
boundary terms.¹⁴ Here $u$, $v$, and $\eta$ are assumed to be twice
continuously differentiable functions in a neighborhood of the closure of
$B$, but $\eta$ is also supposed to be supported in a compact subset of $B$.

¹⁴ The more usual version requires integration over the (boundary) sphere, a
topic deferred to the next chapter. See also Exercises 6 and 7 in that
chapter.

Lemma 4.5 We have the identity

\[ \int_B (v \Delta u - u \Delta v)\, \eta \, dx = \int_B \big( u (\nabla v \cdot \nabla \eta) - v (\nabla u \cdot \nabla \eta) \big) \, dx. \]

Here $\nabla u$ is the gradient of $u$, that is,
$\nabla u = \left( \frac{\partial u}{\partial x_1}, \frac{\partial u}{\partial x_2}, \ldots, \frac{\partial u}{\partial x_d} \right)$
and

\[ \nabla v \cdot \nabla \eta = \sum_{j=1}^{d} \frac{\partial v}{\partial x_j} \frac{\partial \eta}{\partial x_j}, \]

with $\nabla u \cdot \nabla \eta$ defined similarly.

In fact, by integrating by parts as in the proof of (14) we have

\[ \int_B \frac{\partial u}{\partial x_j}\, v \eta \, dx = - \int_B u\, \frac{\partial v}{\partial x_j}\, \eta \, dx - \int_B u v\, \frac{\partial \eta}{\partial x_j} \, dx. \]

We then repeat this with $u$ replaced by $\partial u/\partial x_j$, and sum
in $j$ to obtain

\[ \int_B (\Delta u)\, v \eta \, dx = - \int_B (\nabla u \cdot \nabla v)\, \eta \, dx - \int_B (\nabla u \cdot \nabla \eta)\, v \, dx. \]

This yields the lemma if we subtract from this the symmetric formula
with $u$ and $v$ interchanged.
We shall apply the lemma when $u$ is a given harmonic function, while
$v$ is one of the three following “test” functions: first, $v(x) = 1$;
second, $v(x) = |x|^2$; and third, $v(x) = |x|^{-d+2}$ if $d \ge 3$, while
$v(x) = \log |x|$ if $d = 2$. The relevance of these choices arises because
$\Delta v = 0$ in the first case, while $\Delta v$ is a non-zero constant in
the second case; also $v$ in the third case is a constant multiple of a
“fundamental solution,” and in particular $v(x)$ is harmonic for $x \ne 0$.

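When $v(x) = 1$, we take $\eta = \eta_\varepsilon^+$, where
$\eta_\varepsilon^+(x) = 1$ for $|x| \le 1 - \varepsilon$,
$\eta_\varepsilon^+(x) = 0$ for $|x| \ge 1$, and
$|\nabla \eta_\varepsilon^+(x)| \le c/\varepsilon$. We accomplish this by
setting
$\eta_\varepsilon^+(x) = \chi\!\left( \frac{|x| - 1 + \varepsilon}{\varepsilon} \right)$
for $1 - \varepsilon \le |x| \le 1$, where $\chi$ is a fixed $C^2$ function
on $[0, 1]$ that equals 1 in $[0, 1/4]$ and equals 0 in $[3/4, 1]$. A picture
of $\eta_\varepsilon^+$ is given in Figure 3.

Since $u$ is harmonic, we see that with $v = 1$, Lemma 4.5 implies

\[ \int_B \nabla u \cdot \nabla \eta_\varepsilon^+ \, dx = 0. \tag{22} \]

Next we take $v(x) = |x|^2$; then clearly $\Delta v = 2d$, and with
$\eta = \eta_\varepsilon^+$ the lemma yields:

\[ 2d \int_B u\, \eta_\varepsilon^+ \, dx = \int_B |x|^2 (\nabla u \cdot \nabla \eta_\varepsilon^+) \, dx - 2 \int_B u\, (x \cdot \nabla \eta_\varepsilon^+) \, dx. \]

However, since $\nabla \eta_\varepsilon^+$ is supported in the spherical shell
$S_\varepsilon^+ = \{x : 1 - \varepsilon \le |x| \le 1\}$, we see that

\[ \int_B |x|^2 (\nabla u \cdot \nabla \eta_\varepsilon^+) \, dx = \int_B (\nabla u \cdot \nabla \eta_\varepsilon^+) \, dx + O(\varepsilon), \]

and hence by (22) we get

\[ d \int_B u \, dx = - \lim_{\varepsilon \to 0} \int_B u\, (x \cdot \nabla \eta_\varepsilon^+) \, dx. \tag{23} \]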


We finally turn to $v(x) = |x|^{-d+2}$, when $d \ge 3$, and calculate
$(\Delta v)(x)$ for $x \ne 0$ to see that it vanishes there. In fact, since
$\partial |x|/\partial x_j = x_j/|x|$, we note that

\[ \frac{\partial |x|^a}{\partial x_j} = a\, x_j |x|^{a-2} \]

and

\[ \frac{\partial^2 |x|^a}{\partial x_j^2} = a |x|^{a-2} + a(a - 2)\, x_j^2 |x|^{a-4}. \]

Upon adding in $j$, we obtain that
$\Delta(|x|^a) = [da + a(a - 2)]\, |x|^{a-2}$, and this is zero if
$a = -d + 2$ (or $a = 0$). A similar argument shows that
$\Delta(\log |x|) = 0$ when $d = 2$ and $x \ne 0$.
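The formula $\Delta(|x|^a) = [da + a(a-2)]\,|x|^{a-2}$ can be sanity-checked with finite differences. The sketch below is only an illustration: the central-difference stencil, the step size, the sample points, and the helper names are assumptions made for the example.

```python
import math

def laplacian(v, point, h=1e-3):
    """Second-order central-difference approximation of the Laplacian of v
    at `point` (a list of coordinates)."""
    centre = v(point)
    total = 0.0
    for j in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        total += (v(forward) - 2.0 * centre + v(backward)) / (h * h)
    return total

def power_of_norm(a):
    """Returns the function x -> |x|^a."""
    return lambda x: math.sqrt(sum(t * t for t in x)) ** a

def predicted_laplacian(a, point):
    """The closed form [d a + a (a - 2)] |x|^{a - 2}."""
    d = len(point)
    norm = math.sqrt(sum(t * t for t in point))
    return (d * a + a * (a - 2.0)) * norm ** (a - 2.0)

def log_norm(x):
    return math.log(math.sqrt(sum(t * t for t in x)))

if __name__ == "__main__":
    p3 = [0.3, 0.4, 0.5]
    print(laplacian(power_of_norm(-1.0), p3))        # a = -d + 2 with d = 3
    print(laplacian(power_of_norm(2.0), p3))         # equals 2d = 6
    print(laplacian(log_norm, [0.6, -0.3]))          # d = 2
```

The first and third values should be close to zero, and the second close to $2d = 6$, matching the three test functions used with Lemma 4.5.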

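We now apply the lemma with this $v$ and $\eta = \eta_\varepsilon$ defined as
follows:

\[ \eta_\varepsilon(x) = \begin{cases} 1 - \chi(|x|/\varepsilon) & \text{for } |x| \le \varepsilon, \\ 1 & \text{for } \varepsilon \le |x| \le 1 - \varepsilon, \\ \eta_\varepsilon^+(x) = \chi\!\left( \frac{|x| - 1 + \varepsilon}{\varepsilon} \right) & \text{for } 1 - \varepsilon \le |x| \le 1. \end{cases} \]

The picture for $\eta_\varepsilon$ is as follows (Figure 4):

Figure 4. The function $\eta_\varepsilon$

We note that $|\nabla \eta_\varepsilon|$ is $O(1/\varepsilon)$ throughout. Now
both $u$ and $v$ are harmonic in the support of $\eta_\varepsilon$, and in
this case $\nabla \eta_\varepsilon$ is supported only near the unit sphere
(in the shell $S_\varepsilon^+$) or near the origin (in the ball
$B_\varepsilon = \{|x| < \varepsilon\}$). Thus the right-hand side of the
identity of the lemma gives two contributions, one over $S_\varepsilon^+$ and
the other over $B_\varepsilon$. We consider the first contribution (when
$d \ge 3$); it is

\[ \int_{S_\varepsilon^+} u\, \nabla(|x|^{-d+2}) \cdot \nabla \eta_\varepsilon \, dx - \int_{S_\varepsilon^+} |x|^{-d+2} (\nabla u \cdot \nabla \eta_\varepsilon) \, dx. \]

Now the first integral is
$(-d + 2) \int_{S_\varepsilon^+} u\, |x|^{-d} (x \cdot \nabla \eta_\varepsilon^+) \, dx$,
which by (23) tends to $c \int_B u \, dx$ as $\varepsilon \to 0$, where $c$ is
the constant $(d - 2)d$, since $|x|^{-d} - 1 = O(\varepsilon)$ over
$S_\varepsilon^+$. The second term tends to zero as $\varepsilon \to 0$
because of (22) and the fact that the integrand there is supported in the
shell $S_\varepsilon^+$. A similar argument for $d = 2$, with
$v(x) = \log |x|$, yields the result with $c = -2$.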

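To consider the contribution near the origin, that is, over $B_\varepsilon$,
we temporarily make the additional assumption that $u(0) = 0$. Then because
of the differentiability assumption satisfied by a harmonic function, we
have $u(x) = O(|x|)$ as $|x| \to 0$. Now over $B_\varepsilon$ we have two
terms, the first being
$\int_{B_\varepsilon} u\, \nabla(|x|^{-d+2}) \cdot \nabla \eta_\varepsilon \, dx$,
which is majorized by

\[ \int_{|x| \le \varepsilon} O(\varepsilon)\, |x|^{-d+1}\, O(1/\varepsilon) \, dx \le O\!\left( \int_{|x| \le \varepsilon} |x|^{-d+1} \, dx \right) \le O(\varepsilon), \]

because of (8) in Section 2 of Chapter 2. This term tends to 0 with
$\varepsilon$.

The second term is
$\int_{B_\varepsilon} |x|^{-d+2} (\nabla u \cdot \nabla \eta_\varepsilon) \, dx$,
which is majorized by

\[ O\!\left( \frac{1}{\varepsilon} \int_{|x| \le \varepsilon} |x|^{-d+2} \, dx \right) = O(\varepsilon), \]

using the result just cited. We have used the fact that $\nabla u$ is bounded
and $\nabla \eta_\varepsilon$ is $O(1/\varepsilon)$ throughout $B$. Letting
$\varepsilon \to 0$ we see that this term tends to zero also. A similar
argument works when $d = 2$.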

Thus we have proved that if $u$ is harmonic in a neighborhood of the
closure of the unit ball $B$, and $u(0) = 0$, then $\int_B u \, dx = 0$. We
can drop the assumption $u(0) = 0$ by applying the conclusion we have just
reached to $u(x) - u(0)$ in place of $u(x)$. Therefore we have achieved the
mean-value property (21) for the unit ball.

Now suppose $B_r(x_0) = \{x : |x - x_0| < r\}$ is the ball of radius $r$
centered at $x_0$, and consider $U(x) = u(x_0 + rx)$. If we suppose that $u$
is harmonic in $B_r(x_0)$, then clearly $U$ is harmonic in the unit ball
(indeed, the property of being harmonic is unchanged under translations
$x \to x + x_0$ and dilations $x \to rx$, as is easily verified). Thus if $u$
is harmonic in $\Omega$, and $\overline{B_r(x_0)} \subset \Omega$, then by
the result just proved
$U(0) = \frac{1}{m(B)} \int_{|x| \le 1} U(x) \, dx$, which means that

\[ u(x_0) = \frac{1}{r^d\, m(B)} \int_{|x| < r} u(x_0 + x) \, dx = \frac{1}{m(B_r(x_0))} \int_{B_r(x_0)} u(x) \, dx, \]

by the relative invariance of Lebesgue measure under dilations and
translations. This establishes (21) in general.

The converse property 

To prove this, we first show that the mean-value property allows a useful
extension of itself. For this purpose, we fix a function $\varphi(y)$ that is
continuous in the closed unit ball $\{|y| \le 1\}$ and is radial (that is,
$\varphi(y) = \Phi(|y|)$ for an appropriate $\Phi$), and extend $\varphi$ to
be zero when $|y| > 1$. Suppose in addition that
$\int \varphi(y) \, dy = 1$. We then claim the following:

Lemma 4.6 Whenever $u$ satisfies the mean-value property (21) in $\Omega$,
and the closure of the ball $\{x : |x - x_0| < r\}$ lies in $\Omega$, then

\[ u(x_0) = \int_{\mathbb{R}^d} u(x_0 - ry)\, \varphi(y) \, dy = \int_{\mathbb{R}^d} u(x_0 - y)\, \varphi_r(y) \, dy = (u * \varphi_r)(x_0), \tag{24} \]

where $\varphi_r(y) = r^{-d} \varphi(y/r)$.

That the second of the two identities holds is an immediate consequence
of the change of variables $y \to y/r$; the rightmost equality is merely the
definition of $u * \varphi_r$.

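We can prove (24) as a consequence of a simple observation about
integration. Let $\psi(y)$ be another function on the ball $\{|y| \le 1\}$,
which we assume is bounded. For each $N$, a large positive integer, denote by
$B(j)$ the ball $\{|y| \le j/N\}$. Recall that $\varphi(y) = \Phi(|y|)$. Then

\[ \int \varphi(y)\, \psi(y) \, dy = \lim_{N \to \infty} \sum_{j=1}^{N} \Phi\!\left( \frac{j}{N} \right) \int_{B(j) - B(j-1)} \psi(y) \, dy. \tag{25} \]

To verify this, note that the left-hand side of (25) equals

\[ \sum_{j=1}^{N} \int_{B(j) - B(j-1)} \varphi(y)\, \psi(y) \, dy. \]

However,
$\sup_{1 \le j \le N} \sup_{y \in B(j) - B(j-1)} |\varphi(y) - \Phi(j/N)| = \epsilon_N$,
which tends to zero as $N \to \infty$, since $\varphi$ is radial, continuous,
and $\varphi(y) = \Phi(|y|)$. Thus the left-hand side of (25) differs from
$\sum_{j=1}^{N} \Phi(j/N) \int_{B(j) - B(j-1)} \psi(y) \, dy$ by at most
$\epsilon_N \int_{|y| \le 1} |\psi(y)| \, dy$, proving (25).

We now use this in the case where $\psi(y) = u(x_0 - ry)$ and $\varphi$ is as
before. Then

\[ \int u(x_0 - ry)\, \varphi(y) \, dy = \lim_{N \to \infty} \sum_{j=1}^{N} \Phi\!\left( \frac{j}{N} \right) \int_{B(j) - B(j-1)} u(x_0 - ry) \, dy. \]

However, it follows from the mean-value property assumed for $u$ that

\[ \int_{B(j) - B(j-1)} u(x_0 - ry) \, dy = u(x_0) \left[ m(B(j)) - m(B(j-1)) \right]. \]

Therefore, the right-hand side above equals

\[ u(x_0) \lim_{N \to \infty} \sum_{j=1}^{N} \Phi\!\left( \frac{j}{N} \right) \int_{B(j) - B(j-1)} dy, \]

and this is $u(x_0)$ if we use (25) again, this time with $\psi = 1$, and
recall that $\int \varphi(y) \, dy = 1$. We have therefore proved the lemma.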
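Identity (24) can also be observed numerically in the plane. The following sketch assumes the radial profile $\Phi(s) = (1 - s^2)^2$ (normalized numerically so that $\int \varphi = 1$), the harmonic function $e^{x}\cos y$, and a polar midpoint quadrature; none of these specific choices comes from the text.

```python
import math

def radial_profile(s):
    """Assumed profile Phi(s) of a continuous radial phi supported in |y| <= 1
    (before normalization)."""
    return (1.0 - s * s) ** 2 if s < 1.0 else 0.0

def weighted_average(u, x0, y0, r, n_s=400, n_theta=64):
    """Approximates the integral of u(x0 - r y) phi(y) dy over the unit disc in
    R^2, where phi is the radial profile divided by its total integral."""
    ds = 1.0 / n_s
    dtheta = 2.0 * math.pi / n_theta
    numerator = 0.0
    mass = 0.0
    for i in range(n_s):
        s = (i + 0.5) * ds
        w = radial_profile(s) * s * ds * dtheta   # phi(y) dy in polar form
        for k in range(n_theta):
            t = (k + 0.5) * dtheta
            numerator += u(x0 - r * s * math.cos(t), y0 - r * s * math.sin(t)) * w
            mass += w
    return numerator / mass

def harmonic(x, y):
    return math.exp(x) * math.cos(y)

def not_harmonic(x, y):
    return x * x + y * y

if __name__ == "__main__":
    x0, y0, r = 0.3, -0.2, 0.5
    print(harmonic(x0, y0), weighted_average(harmonic, x0, y0, r))
    print(not_harmonic(x0, y0), weighted_average(not_harmonic, x0, y0, r))
```

For the harmonic function the weighted average reproduces the central value, as (24) predicts; for $x^2 + y^2$ it differs by $r^2/4$ with this particular profile.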

We see from this that every continuous function which satisfies the
mean-value property is its own regularization! To be precise, we have

\[ u(x) = (u * \varphi_r)(x) \tag{26} \]

whenever $x \in \Omega$ and the distance from $x$ to the boundary of $\Omega$
is larger than $r$. If we now require in addition that
$\varphi \in C_0^\infty\{|y| < 1\}$, then by the discussion in Section 1 we
conclude that $u$ is smooth throughout $\Omega$.

Let us now establish that such functions are harmonic. Indeed, by
Taylor's theorem, for every $x_0 \in \Omega$

\[ u(x_0 + x) - u(x_0) = \sum_{j=1}^{d} a_j x_j + \frac{1}{2}\sum_{j,k=1}^{d} a_{jk} x_j x_k + \epsilon(x), \tag{27} \]

where $\epsilon(x) = O(|x|^3)$ as $|x| \to 0$. We note next that $\int_{|x|\le r} x_j\,dx = 0$ and
$\int_{|x|\le r} x_j x_k\,dx = 0$ for all $j$ and $k$ with $k \ne j$. This follows by carrying
out the integrations first in the $x_j$ variable and noting that the integral
vanishes because $x_j$ is an odd function. Also by an obvious symmetry
$\int_{|x|\le r} x_j^2\,dx = \int_{|x|\le r} x_k^2\,dx$, and by the relative dilation-invariance (see
Section 3, Chapter 1) these are equal to $r^2 \int_{|x|\le r} (x_1/r)^2\,dx =
r^{d+2} \int_{|x|\le 1} x_1^2\,dx = c\,r^{d+2}$, with $c > 0$. We now integrate both sides of (27)
over the ball $\{|x| \le r\}$, divide by $r^d$, and use the mean-value property.
The result is that

\[ \frac{c\,r^2}{2} \sum_{j=1}^{d} a_{jj} = \frac{c\,r^2}{2}\,(\Delta u)(x_0) = O\!\left(\frac{1}{r^d}\int_{|x|\le r} |\epsilon(x)|\,dx\right) = O(r^3). \]

Letting $r \to 0$ then gives $\Delta u(x_0) = 0$. Since $x_0$ was an arbitrary point
of $\Omega$, the proof of Theorem 4.2 is concluded.
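
The last step can be illustrated numerically in the case $d = 2$. The following is only a sketch under stated assumptions: the helper name `disc_average`, the two test functions, and the quadrature sizes are hypothetical choices made for illustration and do not come from the text. In two dimensions $\int_{|x|\le r} x_1^2\,dx = \pi r^4/4$, so after dividing by the area of the disc the computation above predicts that the average of $u$ over the disc of radius $r$ exceeds $u(x_0)$ by about $(r^2/8)\,\Delta u(x_0)$, and by zero when $u$ is harmonic.

```python
import math


def disc_average(u, x0, y0, r, n_r=200, n_t=200):
    """Approximate the average of u over the disc of radius r centered at
    (x0, y0), using a midpoint rule in polar coordinates (illustrative only)."""
    total = 0.0
    for i in range(n_r):
        rho = (i + 0.5) * r / n_r
        for j in range(n_t):
            theta = (j + 0.5) * 2 * math.pi / n_t
            # The factor rho is the Jacobian of polar coordinates.
            total += u(x0 + rho * math.cos(theta), y0 + rho * math.sin(theta)) * rho
    area_element = (r / n_r) * (2 * math.pi / n_t)
    return total * area_element / (math.pi * r * r)


# Hypothetical test functions: the first is harmonic, the second has Laplacian 4.
harmonic = lambda x, y: x * x - y * y
non_harmonic = lambda x, y: x * x + y * y

r = 0.5
gap_harmonic = disc_average(harmonic, 0.3, -0.2, r) - harmonic(0.3, -0.2)
gap_non_harmonic = disc_average(non_harmonic, 0.3, -0.2, r) - non_harmonic(0.3, -0.2)
predicted = (r * r / 8) * 4  # (r^2 / 8) * Laplacian
```

Under these assumptions `gap_harmonic` should vanish up to rounding, while `gap_non_harmonic` should be close to `predicted`.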


Theorem 4.3 and some corollaries 

We come now to the proof of Theorem 4.3. Let us assume that $u$ is
weakly harmonic in $\Omega$. For each $\epsilon > 0$ we define $\Omega_\epsilon$ to be the set of
points in $\Omega$ that are at a distance greater than $\epsilon$ from its boundary:

\[ \Omega_\epsilon = \{x \in \Omega : d(x, \partial\Omega) > \epsilon\}. \]

Notice that $\Omega_\epsilon$ is open, and that every point of $\Omega$ belongs to $\Omega_\epsilon$ if $\epsilon$
is small enough. Then the regularization $u * \varphi_r = u_r$ considered in the
previous theorem is defined in $\Omega_\epsilon$, for $r < \epsilon$, and as we have noted is a
smooth function there. We next observe that it is weakly harmonic in
$\Omega_\epsilon$. In fact, for $\psi \in C_0^\infty(\Omega_\epsilon)$ we have

\[ (u_r, \Delta\psi) = \int_{\mathbb{R}^d} \left( \int_{\mathbb{R}^d} u(x - ry)\varphi(y)\,dy \right) (\Delta\psi)(x)\,dx = \int_{\mathbb{R}^d} \varphi(y) \left( \int_{\mathbb{R}^d} u(x - ry)(\Delta\psi)(x)\,dx \right) dy, \]

by Fubini's theorem, and the inner integral vanishes for $y$, $|y| \le 1$, because
it equals $(u, \Delta\psi_r)$, with $\psi_r(x) = \psi(x + ry)$. Thus we have

\[ (u * \varphi_r, \Delta\psi) = 0, \]


and hence $u * \varphi_r$ is weakly harmonic. Next, since this regularization is
automatically smooth it is then also harmonic. Moreover, we claim that

\[ (u * \varphi_{r_1})(x) = (u * \varphi_{r_2})(x) \tag{28} \]

whenever $x \in \Omega_\epsilon$ and $r_1 + r_2 < \epsilon$. Indeed, $(u * \varphi_{r_1}) * \varphi_{r_2} = u * \varphi_{r_1}$ as
we have shown in (26) above. However convolutions are commutative
(see Remark (6) in Chapter 2); thus $(u * \varphi_{r_1}) * \varphi_{r_2} = (u * \varphi_{r_2}) * \varphi_{r_1} =
u * \varphi_{r_2}$, and (28) is proved.

Now we can let $r_1$ tend to zero, while keeping $r_2$ fixed. We know by the
properties of approximations to the identity that $(u * \varphi_{r_1})(x) \to u(x)$ for
almost every $x$ in $\Omega_\epsilon$; hence $u(x)$ equals $u_{r_2}(x)$ for almost every $x \in \Omega_\epsilon$.
Thus $u$ can be corrected on $\Omega_\epsilon$ (setting it equal to $u_{r_2}$), so that it becomes
harmonic there. Now since $\epsilon$ can be taken arbitrarily small, the proof of
the theorem is complete.

We state several further corollaries arising out of the above theorems. 

Corollary 4.7 Every harmonic function is indefinitely differentiable. 

Corollary 4.8 Suppose $\{u_n\}$ is a sequence of harmonic functions in $\Omega$
that converges to a function $u$ uniformly on compact subsets of $\Omega$ as
$n \to \infty$. Then $u$ is also harmonic.

The first of these corollaries was already proved as a consequence
of (26). For the second, we use the fact that each $u_n$ satisfies the
mean-value property

\[ u_n(x_0) = \frac{1}{m(B)} \int_{B} u_n(x)\,dx, \]

whenever $B$ is a ball with center at $x_0$, and $\overline{B} \subset \Omega$. Thus by the uniform
convergence it follows that $u$ also satisfies this property, and hence $u$ is
harmonic.

We should point out that these properties of harmonic functions on
$\mathbb{R}^d$ are reminiscent of similar properties of holomorphic functions. But
this should not be surprising, given the close connection between these
two classes of functions in the special case $d = 2$.

4.2 The boundary value problem and Dirichlet’s principle 

The $d$-dimensional Dirichlet boundary value problem we are concerned
with may be stated as follows. Let $\Omega$ be an open bounded set in $\mathbb{R}^d$.
Given a continuous function $f$ defined on the boundary $\partial\Omega$, we wish to
find a function $u$ that is continuous in $\overline{\Omega}$, harmonic in $\Omega$, and such that
$u = f$ on $\partial\Omega$.

An important preliminary observation is that the solution to the problem,
if it exists, is unique. Indeed, if $u_1$ and $u_2$ are two solutions
then $u_1 - u_2$ is harmonic in $\Omega$ and vanishes on the boundary. Thus by
the maximum principle (Corollary 4.4) we have $u_1 - u_2 = 0$, and hence
$u_1 = u_2$.

Turning to the existence of a solution, we shall now pursue the approach
of Dirichlet's principle outlined earlier.

We consider the class of functions $C^1(\overline{\Omega})$ and equip this space with
the inner product

\[ \langle u, v \rangle = \int_{\Omega} (\nabla u \cdot \overline{\nabla v})\,dx, \]

where of course

\[ \nabla u \cdot \overline{\nabla v} = \sum_{j=1}^{d} \frac{\partial u}{\partial x_j}\,\overline{\frac{\partial v}{\partial x_j}}. \]

With this inner product, we have a corresponding norm given by
$\|u\|^2 = \langle u, u \rangle$. We note that $\|u\| = 0$ is the same as $\nabla u = 0$ throughout
$\Omega$, which means that $u$ is constant on each connected component of
$\Omega$. Thus we are led to consider equivalence classes in $C^1(\overline{\Omega})$ of elements
modulo functions that are constant on components of $\Omega$. These then
form a pre-Hilbert space with inner product and norm given as above.
We call this pre-Hilbert space $\mathcal{H}_0$.

In studying the completion $\mathcal{H}$ of $\mathcal{H}_0$ and its applications to the
boundary value problem, the following lemma is needed.

Lemma 4.9 Let $\Omega$ be an open bounded set in $\mathbb{R}^d$. Suppose $v$ belongs to
$C^1(\overline{\Omega})$ and $v$ vanishes on $\partial\Omega$. Then

\[ \int_{\Omega} |v(x)|^2\,dx \le c_\Omega \int_{\Omega} |\nabla v(x)|^2\,dx. \tag{29} \]

Proof. This conclusion could in fact be deduced from the considerations
given in Lemma 3.3. We prefer to prove this easy version separately
to highlight a simple idea that we shall also use later. It should be noted
that the argument yields the estimate $c_\Omega \le d(\Omega)^2$, where $d(\Omega)$ is the
diameter of $\Omega$.

We proceed on the basis of the following observation. Suppose $f$ is a
function in $C^1(\overline{I})$, where $I = (a, b)$ is an interval in $\mathbb{R}$. Assume that $f$
vanishes at one of the end-points of $I$. Then

\[ \int_I |f(t)|^2\,dt \le |I|^2 \int_I |f'(t)|^2\,dt, \tag{30} \]

where $|I|$ denotes the length of $I$.

Indeed, suppose $f(a) = 0$. Then $f(s) = \int_a^s f'(t)\,dt$, and by the
Cauchy-Schwarz inequality

\[ |f(s)|^2 \le |I| \int_a^s |f'(t)|^2\,dt \le |I| \int_I |f'(t)|^2\,dt. \]

Integrating this in $s$ over $I$ then yields (30).
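
Inequality (30) is easy to test numerically. The sketch below is illustrative only: the helper name `poincare_sides`, the choice $f(t) = \sin t$ on $I = (0, 2)$, and the midpoint quadrature are assumptions made here, not part of the text. It approximates both sides of (30) for a function vanishing at the left end-point.

```python
import math


def poincare_sides(f, df, a, b, n=20000):
    """Midpoint-rule approximations of the two sides of (30) on I = (a, b):
    the integral of |f|^2, and |I|^2 times the integral of |f'|^2."""
    h = (b - a) / n
    mids = [a + (k + 0.5) * h for k in range(n)]
    lhs = sum(f(t) ** 2 for t in mids) * h
    rhs = (b - a) ** 2 * sum(df(t) ** 2 for t in mids) * h
    return lhs, rhs


# Hypothetical example: f(t) = sin t vanishes at the end-point a = 0.
lhs, rhs = poincare_sides(math.sin, math.cos, 0.0, 2.0)
```

Here `lhs` approximates $\int_0^2 \sin^2 t\,dt = 1 - \tfrac{1}{4}\sin 4$, and it should not exceed `rhs`, in agreement with (30).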

To prove (29), write $x = (x_1, x')$ with $x_1 \in \mathbb{R}$ and $x' \in \mathbb{R}^{d-1}$ and apply
(30) to $f$ defined by $f(x_1) = v(x_1, x')$, with $x'$ fixed. Let $J(x')$
be the open set in $\mathbb{R}$ that is the corresponding slice of $\Omega$ given by
$\{x_1 \in \mathbb{R} : (x_1, x') \in \Omega\}$. The set $J(x')$ can be written as a disjoint union
of open intervals $I_j$. (Note that in fact $f(x_1)$ vanishes at both end-points
of each $I_j$.) For each $j$, on applying (30) we obtain

\[ \int_{I_j} |v(x_1, x')|^2\,dx_1 \le |I_j|^2 \int_{I_j} |\nabla v(x_1, x')|^2\,dx_1. \]

Now since $|I_j| \le d(\Omega)$, summing over the disjoint intervals $I_j$ gives

\[ \int_{J(x')} |v(x_1, x')|^2\,dx_1 \le d(\Omega)^2 \int_{J(x')} |\nabla v(x_1, x')|^2\,dx_1, \]

and an integration over $x' \in \mathbb{R}^{d-1}$ then leads to (29).

Now let $S_0$ denote the linear subspace of $C^1(\overline{\Omega})$ consisting of functions
that vanish on the boundary of $\Omega$. We note that distinct elements of $S_0$
remain distinct under the equivalence relation defining $\mathcal{H}_0$ (since constants
on each component that vanish on the boundary are zero), and so
$S_0$ may be identified with a subspace of $\mathcal{H}_0$. Denote by $S$ the closure in
$\mathcal{H}$ of this subspace, and let $P_S$ be the orthogonal projection of $\mathcal{H}$ onto $S$.

With these preliminaries out of the way, we first try to solve the boundary
value problem with $f$ given on $\partial\Omega$ under the additional assumption
that $f$ is the restriction to $\partial\Omega$ of a function $F$ in $C^1(\overline{\Omega})$. (How this
additional hypothesis can be removed will be explained below.) Following
the prescription of Dirichlet's principle, we seek a sequence $\{u_n\}$
with $u_n \in C^1(\overline{\Omega})$ and $u_n|_{\partial\Omega} = F|_{\partial\Omega}$, such that the Dirichlet integrals
$\|u_n\|^2$ converge to a minimum value. This means that $u_n = F - v_n$,
with $v_n \in S_0$, and that $\lim_{n\to\infty} \|u_n\| = \lim_{n\to\infty} \|F - v_n\|$ is the distance from $F$ to
$S_0$. Since $S = \overline{S_0}$, this sequence also minimizes the distance from $F$ to
$S$ in $\mathcal{H}$.

Now what do the elementary facts about orthogonal projections teach
us? According to the proof of Lemma 4.1 in the previous chapter, we
conclude that the sequence $\{v_n\}$, and hence also the sequence $\{u_n\}$,
both converge in the norm of $\mathcal{H}$, the former having a limit $P_S(F)$. Now
applying Lemma 4.9 to $v_n - v_m$ we deduce that $\{v_n\}$ and $\{u_n\}$ are also
Cauchy in the $L^2(\Omega)$-norm, and thus converge also in the $L^2$-norm. Let
$u = \lim_{n\to\infty} u_n$. Then

\[ u = F - P_S(F). \tag{31} \]

We see that $u$ is weakly harmonic. Indeed, whenever $\psi \in C_0^\infty(\Omega)$, then
$\psi \in S$, and hence by (31) $\langle u, \psi \rangle = 0$. Therefore $\langle u_n, \psi \rangle \to 0$, but by
integration by parts, as we have seen,

\[ \langle u_n, \psi \rangle = \int_{\Omega} (\nabla u_n \cdot \overline{\nabla \psi})\,dx = -\int_{\Omega} u_n\,\overline{\Delta\psi}\,dx = -(u_n, \Delta\psi). \]

As a result, $(u, \Delta\psi) = 0$, and so $u$ is weakly harmonic and thus can be
corrected on a set of measure zero to become harmonic.

This is the purported solution to our problem. However, two issues
still remain to be resolved.

The first is that while $u$ is the limit of a sequence $\{u_n\}$ of continuous
functions in $\overline{\Omega}$ and $u_n|_{\partial\Omega} = f$ for each $n$, it is not clear that $u$ itself is
continuous in $\overline{\Omega}$ and $u|_{\partial\Omega} = f$.

The second issue is that we restricted our argument above to those
$f$ defined on the boundary of $\Omega$ that arise as restrictions of functions
in $C^1(\overline{\Omega})$.

The second obstacle is the easier of the two to overcome, and this can
be done by the use of the following lemma, applied to the set $\Gamma = \partial\Omega$.

Lemma 4.10 Suppose $\Gamma$ is a compact set in $\mathbb{R}^d$, and $f$ is a continuous
function on $\Gamma$. Then there exists a sequence $\{F_n\}$ of smooth functions
on $\mathbb{R}^d$ so that $F_n \to f$ uniformly on $\Gamma$.

In fact, supposing we can deal with the first issue raised, then with the
lemma we proceed as follows. We find the functions $U_n$ that are harmonic
in $\Omega$, continuous on $\overline{\Omega}$, and such that $U_n|_{\partial\Omega} = F_n|_{\partial\Omega}$. Now since
the $\{F_n\}$ converges uniformly (to $f$) on $\partial\Omega$, it follows by the maximum
principle that the sequence $\{U_n\}$ converges uniformly to a function $u$
that is continuous on $\overline{\Omega}$, has the property that $u|_{\partial\Omega} = f$, and which is
moreover harmonic (by Corollary 4.8 above). This achieves our goal.

The proof of Lemma 4.10 is based on the following extension principle.

Lemma 4.11 Let $f$ be a continuous function on a compact subset $\Gamma$ of
$\mathbb{R}^d$. Then there exists a function $G$ on $\mathbb{R}^d$ that is continuous, and so that
$G|_\Gamma = f$.

Proof. We begin with the observation that if $K_0$ and $K_1$ are two
disjoint compact sets, there exists a continuous function $0 \le g(x) \le 1$ on
$\mathbb{R}^d$ which takes the value 0 on $K_0$ and 1 on $K_1$. Indeed, if $d(x, \Omega)$ denotes
the distance from $x$ to $\Omega$, we see that

\[ g(x) = \frac{d(x, K_0)}{d(x, K_0) + d(x, K_1)} \]

has the required properties.

Now, we may assume without loss of generality that $f$ is non-negative
and bounded by 1 on $\Gamma$. Let

\[ K_0 = \{x \in \Gamma : 2/3 \le f(x) \le 1\} \quad\text{and}\quad K_1 = \{x \in \Gamma : 0 \le f(x) \le 1/3\}, \]

so that $K_0$ and $K_1$ are disjoint. Clearly, the observation above
guarantees that there exists a continuous function $0 \le G_1(x) \le 1/3$ on
$\mathbb{R}^d$ which takes the value $1/3$ on $K_0$ and 0 on $K_1$ (for instance
$G_1 = \frac{1}{3}(1 - g)$). Then we see that

\[ 0 \le f(x) - G_1(x) \le \frac{2}{3} \quad\text{for all } x \in \Gamma. \]

We now repeat the argument with $f$ replaced by $f - G_1$. In the first
step, we have gone from $0 \le f \le 1$ to $0 \le f - G_1 \le 2/3$. Consequently,
we may find a continuous function $G_2$ on $\mathbb{R}^d$ so that

\[ 0 \le f(x) - G_1(x) - G_2(x) \le \left(\frac{2}{3}\right)^2 \quad\text{on } \Gamma, \]

and $0 \le G_2 \le \frac{1}{3}\cdot\frac{2}{3}$. Repeating this process, we find continuous functions
$G_n$ on $\mathbb{R}^d$ such that

\[ 0 \le f(x) - G_1(x) - \cdots - G_N(x) \le \left(\frac{2}{3}\right)^N \quad\text{on } \Gamma, \]

and $0 \le G_n \le \frac{1}{3}\left(\frac{2}{3}\right)^{n-1}$ on $\mathbb{R}^d$. If we define

\[ G = \sum_{n=1}^{\infty} G_n \]

then $G$ is continuous and equals $f$ on $\Gamma$.
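
The iteration in this proof can be carried out explicitly when $\Gamma$ is a finite set of points on the real line. The sketch below makes that simplifying assumption; the names `dist` and `extension_terms`, the sample points, and the sample values of $f$ are all hypothetical and serve only as an illustration. Each step uses the distance-function formula above, rescaled so that $G_n$ equals one third of the current bound on $K_0$ and 0 on $K_1$.

```python
def dist(x, K):
    """Distance from the real number x to the finite non-empty set K."""
    return min(abs(x - p) for p in K)


def extension_terms(gamma, f_vals, steps):
    """Return the functions [G_1, ..., G_steps] of the proof, for f given on
    the finite set gamma (distinct points) with 0 <= f <= 1."""
    remainder = dict(zip(gamma, f_vals))  # f - G_1 - ... - G_n on gamma
    scale = 1.0                           # current bound (2/3)^(n-1) on the remainder
    terms = []
    for _ in range(steps):
        K0 = [p for p in gamma if remainder[p] >= 2 * scale / 3]
        K1 = [p for p in gamma if remainder[p] <= scale / 3]

        def G(x, K0=K0, K1=K1, scale=scale):
            if not K0:           # remainder already below 2/3 of the bound
                return 0.0
            if not K1:           # remainder everywhere above 1/3 of the bound
                return scale / 3
            d0, d1 = dist(x, K0), dist(x, K1)
            return (scale / 3) * d1 / (d0 + d1)  # scale/3 on K0, 0 on K1

        terms.append(G)
        for p in gamma:
            remainder[p] -= G(p)
        scale *= 2 / 3
    return terms


# Hypothetical data: four points of the line and values of f in [0, 1].
gamma = [0.0, 1.0, 2.5, 4.0]
f_vals = [0.0, 1.0, 0.5, 0.2]
terms = extension_terms(gamma, f_vals, 30)


def G_partial(x):
    """Partial sum G_1 + ... + G_30, defined on all of the real line."""
    return sum(G(x) for G in terms)
```

On the points of `gamma` the partial sum should differ from `f_vals` by at most $(2/3)^{30}$, and each term is bounded by $\frac{1}{3}(2/3)^{n-1}$ everywhere, which is what makes the series converge uniformly.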

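To complete the proof of Lemma 4.10, we argue as follows. We regularize
the function $G$ obtained in Lemma 4.11 by defining

\[ F_\epsilon(x) = \epsilon^{-d} \int_{\mathbb{R}^d} G(x - y)\varphi(y/\epsilon)\,dy = \int_{\mathbb{R}^d} G(y)\varphi_\epsilon(x - y)\,dy, \]

with $\varphi_\epsilon(y) = \epsilon^{-d}\varphi(y/\epsilon)$, where $\varphi$ is a non-negative $C_0^\infty$ function supported
in the unit ball with $\int \varphi(y)\,dy = 1$. Then each $F_\epsilon$ is a $C^\infty$ function.
However,

\[ F_\epsilon(x) - G(x) = \int_{\mathbb{R}^d} (G(y) - G(x))\varphi_\epsilon(x - y)\,dy. \]

Since the integration above is restricted to $|x - y| \le \epsilon$, then if $x \in \Gamma$, we
see that

\[ |F_\epsilon(x) - G(x)| \le \sup_{|x-y|\le\epsilon} |G(x) - G(y)| \int_{\mathbb{R}^d} \varphi_\epsilon(x - y)\,dy \le \sup_{|x-y|\le\epsilon} |G(x) - G(y)|. \]

The last quantity tends to zero with $\epsilon$ by the uniform continuity of $G$
near $\Gamma$, and if we choose $\epsilon = 1/n$ we obtain our desired sequence.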
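
The estimate $|F_\epsilon(x) - G(x)| \le \sup_{|x-y|\le\epsilon} |G(x) - G(y)|$ can be observed in a one-dimensional experiment. This is a sketch under assumptions not made in the text: the bump function `bump`, the helper `mollify`, the normalization by a discrete sum, and the test function $G(x) = |x|$ are all illustrative choices. For $G(x) = |x|$ the supremum on the right is at most $\epsilon$.

```python
import math


def bump(y):
    """A standard smooth non-negative function supported in |y| < 1
    (not normalized; mollify divides by the sum of the weights)."""
    return math.exp(-1.0 / (1.0 - y * y)) if abs(y) < 1 else 0.0


def mollify(G, x, eps, n=2000):
    """Discrete version of F_eps(x): a weighted average of the values
    G(x - eps * y) over nodes y in (-1, 1), with weights bump(y)."""
    h = 2.0 / n
    nodes = [-1.0 + (k + 0.5) * h for k in range(n)]
    weights = [bump(y) for y in nodes]
    total = sum(weights)
    return sum(w * G(x - eps * y) for w, y in zip(weights, nodes)) / total


# Hypothetical test: G(x) = |x| is uniformly continuous but not smooth at 0.
errors = {eps: max(abs(mollify(abs, k / 10.0, eps) - abs(k / 10.0))
                   for k in range(-20, 21))
          for eps in (0.5, 0.1, 0.01)}
```

Because `mollify` is a convex combination of values of $G$ at points within $\epsilon$ of $x$, each entry of `errors` should be bounded by the corresponding `eps`, mirroring the argument above.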

The two-dimensional theorem 

We now take up the problem of whether the proposed solution u takes 
on the desired boundary values. Here we limit our discussion to the case 
of two dimensions for the reason that in the higher dimensional situation 
the problems that arise involve a number of questions that would take 
us beyond the scope of this book. In contrast, in two dimensions, while 
the proof of the result below is a little tricky, it is within the reach of the 
Hilbert space methods we have been illustrating. 

The Dirichlet problem can be solved (in two dimensions as well as
in higher dimensions) only if certain restrictions are made concerning
the nature of the domain $\Omega$. The regularity we shall assume, while not
optimal,^{10} is broad enough to encompass many applications, and yet
has a simple geometric form. It can be described as follows. We fix an
initial triangle $T_0$ in $\mathbb{R}^2$. To be precise, we assume that $T_0$ is an isosceles
triangle whose two equal sides have length $\ell$, and make an angle $\alpha$ at
their common vertex. The exact values of $\ell$ and $\alpha$ are unimportant;
they may both be taken as small as one wishes, but must be kept fixed
throughout our discussion. With the shape of $T_0$ thus determined, we
say that $T$ is a special triangle if it is congruent to $T_0$, that is, $T$ arises
from $T_0$ by a translation and rotation. The vertex of $T$ is defined to be
the intersection of its two equal sides.

The regularity property of $\Omega$ we assume, the outside-triangle condition,
is as follows: with $\ell$ and $\alpha$ fixed, for each $x$ in the boundary of
$\Omega$, there is a special triangle with vertex $x$ whose interior lies outside $\Omega$.
(See Figure 5.)

Figure 5. The triangle $T_0$ and the special triangle $T$

Theorem 4.12 Let $\Omega$ be an open bounded set in $\mathbb{R}^2$ that satisfies the
outside-triangle condition. If $f$ is a continuous function on $\partial\Omega$, then the
boundary value problem $\Delta u = 0$ with $u$ continuous in $\overline{\Omega}$ and $u|_{\partial\Omega} = f$ is
always uniquely solvable.

Some comments are in order. 

(1) If $\Omega$ is bounded by a polygonal curve, it satisfies the conditions of
the theorem.

(2) More generally, if $\Omega$ is appropriately bounded by finitely many Lipschitz
curves, or in particular $C^1$ curves, the conditions are also satisfied.

(3) There are simple examples where the problem is not solvable: for
instance, if $\Omega$ is the punctured disc. This example of course does not
satisfy the outside-triangle condition.

(4) The conditions on $\Omega$ in this theorem are not optimal: one can construct
examples of $\Omega$ when the problem is solvable for which the above
regularity fails.

For more details on the above, see Exercise 19 and Problem 4.

^{10}The optimal conditions involve the notion of capacity of sets.

We turn to the proof of the theorem. It is based on the following
proposition, which may be viewed as a refined version of Lemma 4.9
above.

Proposition 4.13 For any bounded open set $\Omega$ in $\mathbb{R}^2$ that satisfies the
outside-triangle condition there are two constants $c_1 < 1$ and $c_2 > 1$ such
that the following holds. Suppose $z$ is a point in $\Omega$ whose distance from
$\partial\Omega$ is $\delta$. Then whenever $v$ belongs to $C^1(\overline{\Omega})$ and $v|_{\partial\Omega} = 0$, we have

\[ \int_{B_{c_1\delta}(z)} |v(x)|^2\,dx \le C\delta^2 \int_{B_{c_2\delta}(z)\cap\Omega} |\nabla v(x)|^2\,dx. \tag{32} \]

The bound $C$ can be chosen to depend only on the diameter of $\Omega$ and the
parameters $\ell$ and $\alpha$ which determine the triangles $T$.

Figure 6. The situation in Proposition 4.13

Let us see how the proposition proves the theorem. We have already
shown that it suffices to assume that $f$ is the restriction to $\partial\Omega$ of an
$F$ that belongs to $C^1(\overline{\Omega})$. We recall we had the minimizing sequence
$u_n = F - v_n$, with $v_n \in C^1(\overline{\Omega})$ and $v_n|_{\partial\Omega} = 0$. Moreover, this sequence
converges in the norm of $\mathcal{H}$ and $L^2(\Omega)$ to a limit $u$, such that $u = F - v$
is harmonic in $\Omega$. Then since (32) holds for each $v_n$, it also holds for
$v = F - u$; that is,

\[ \int_{B_{c_1\delta}(z)} |(F - u)(x)|^2\,dx \le C\delta^2 \int_{B_{c_2\delta}(z)\cap\Omega} |\nabla(F - u)(x)|^2\,dx. \tag{33} \]

To prove the theorem it suffices, in view of the continuity of $u$ in $\Omega$, to
show that if $y$ is any fixed point in $\partial\Omega$, and $z$ is a variable point in $\Omega$,
then $u(z) \to f(y)$ as $z \to y$. Let $\delta = \delta(z)$ denote the distance of $z$ from
the boundary. Then $\delta(z) \le |z - y|$ and therefore $\delta(z) \to 0$ as $z \to y$.

We now consider the averages of $F$ and $u$ taken over the discs centered
at $z$ of radius $c_1\delta(z)$ (recall that $c_1 < 1$). We denote these averages
by $\mathrm{Av}(F)(z)$ and $\mathrm{Av}(u)(z)$, respectively. Then by the Cauchy-Schwarz
inequality, we have

\[ |\mathrm{Av}(F)(z) - \mathrm{Av}(u)(z)|^2 \le \frac{1}{\pi(c_1\delta)^2} \int_{B_{c_1\delta}(z)} |F - u|^2\,dx, \]

which by (33) is then majorized by

\[ C' \int_{B_{c_2\delta}(z)\cap\Omega} |\nabla(F - u)|^2\,dx. \]

The absolute continuity of the integral guarantees that the last integral
tends to zero with $\delta$, since $m(B_{c_2\delta}) \to 0$. However, by the mean-value
property, $\mathrm{Av}(u)(z) = u(z)$, while by the continuity of $F$ in $\overline{\Omega}$,

\[ \mathrm{Av}(F)(z) = \frac{1}{m(B_{c_1\delta}(z))} \int_{B_{c_1\delta}(z)} F(x)\,dx \to f(y), \]

because $F|_{\partial\Omega} = f$ and $z \to y$. Altogether this gives $u(z) \to f(y)$, and the
theorem is proved, once the proposition is established.

To prove the proposition, we construct for each $z \in \Omega$ whose distance
from $\partial\Omega$ is $\delta$, and for $\delta$ sufficiently small, a rectangle $R$ with the following
properties:

(1) $R$ has side lengths $2c_1\delta$ and $M\delta$ (with $c_1 \le 1/2$, $M \le 4$).

(2) $B_{c_1\delta}(z) \subset R$.

(3) Each segment in $R$, that is parallel to and of length equal to the
length of the long side, intersects the boundary of $\Omega$.

To obtain $R$ we let $y$ be a point in $\partial\Omega$ so that $\delta = |z - y|$, and we apply
the outside-triangle condition at $y$. As a result, the line joining $z$ with
$y$ and one of the sides of the special triangle whose vertex is at $y$ must
make an angle $\beta < \pi$. (In fact $\beta \le \pi - \alpha/2$, as is easily seen.) Now after
a suitable rotation and translation we may assume that $y = 0$ and that
the angle going from the $x_2$-axis to the line joining $z$ to 0 is equal to the
angle of the side of the triangle to the $x_2$-axis. This angle can be taken
to be $\gamma$, with $\gamma > \alpha/4$. (See Figure 7.)

Figure 7. Placement of the rectangle $R$

There is an alternate possibility that occurs with this figure reflected
through the $x_2$-axis.

With this picture in mind we construct the rectangle $R$ as indicated
in Figure 8.

It has its long side parallel to the $x_2$-axis, contains the disc $B_{c_1\delta}(z)$,
and every segment in $R$ parallel to the $x_2$-axis intersects the (extension) of
the side of the triangle.

Note that the coordinates of $z$ are $(-\delta\sin\gamma, \delta\cos\gamma)$. We choose $c_1 <
\sin\gamma$; then $B_{c_1\delta}(z)$ lies in the same (left) half-plane as $z$.

We next focus our attention on two points: $P_1$, which lies on the $x_1$-axis
at the intersection of this axis with the far side of the rectangle; and
$P_2$, which is at the corner of that side of the rectangle, that is, at the
intersection of the (continuation) of the side of the outside triangle and
the further side of the rectangle. The coordinates of $P_1$ are $(-a, 0)$, where
$a = \delta c_1 + \delta\sin\gamma$. The coordinates of $P_2$ are $(-a, -a/\tan\gamma)$. Note that the
distance of $P_2$ from the origin is $a/\sin\gamma$, which is $\delta + c_1\delta/\sin\gamma \le 2\delta$,
since $c_1 < \sin\gamma$.

Now we observe that the length of the larger side of the rectangle is
the sum of the part that lies above the $x_1$-axis and the part that lies
below. The upper part has length the sum of the radius of the disc plus
the height of $z$, and this is $c_1\delta + \delta\cos\gamma \le 2\delta$. The lower part has length
equal to $a/\tan\gamma$, which is $\delta\cos\gamma + c_1\delta\,(\cos\gamma/\sin\gamma) \le 2\delta$, since $c_1 < \sin\gamma$. Thus
we find that the length of the side is $\le 4\delta$.

Figure 8. The disc $B_{c_1\delta}(z)$ and the rectangle $R$ containing it
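
The bounds just obtained are elementary trigonometry and can be checked for sample values. In the sketch below the function name `rectangle_data` and the numbers chosen for $\delta$, $\gamma$ and $c_1$ are hypothetical, and $\gamma$ is assumed to lie strictly between 0 and $\pi/2$; only the formulas for $a$, $P_2$ and the two parts of the long side come from the text.

```python
import math


def rectangle_data(delta, gamma, c1):
    """Quantities from the construction of R, for given delta > 0,
    an angle 0 < gamma < pi/2 (assumed), and a constant 0 < c1 < sin(gamma)."""
    assert delta > 0 and 0 < gamma < math.pi / 2
    assert 0 < c1 < math.sin(gamma)
    a = delta * c1 + delta * math.sin(gamma)          # P1 = (-a, 0)
    p2 = (-a, -a / math.tan(gamma))                   # corner on the extended side
    upper = c1 * delta + delta * math.cos(gamma)      # part above the x1-axis
    lower = a / math.tan(gamma)                       # part below the x1-axis
    return {
        "a": a,
        "P2": p2,
        "dist_P2": math.hypot(p2[0], p2[1]),          # should equal a / sin(gamma)
        "upper": upper,
        "lower": lower,
        "long_side": upper + lower,
    }


# Hypothetical sample values.
delta, gamma, c1 = 0.1, math.pi / 6, 0.4
data = rectangle_data(delta, gamma, c1)
```

For these values the distance of $P_2$ from the origin and both parts of the long side should stay below $2\delta$, so that the long side is at most $4\delta$.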

Now it is clear from the construction that each vertical segment in $R$
starting from the disc $B_{c_1\delta}(z)$ when continued downward and parallel to
the $x_2$-axis intersects the line joining 0 to $P_2$ (which is a continuation
of the side of the triangle). Moreover, if the length $\ell$ of this side of the
triangle exceeds the distance of $P_2$ from the origin, then the segment
intersects the triangle. When this intersection occurs the segment starting
from $B_{c_1\delta}(z)$ must also intersect the boundary of $\Omega$, since the triangle
lies outside $\Omega$. Therefore if $\ell \ge 2\delta$ the desired intersection occurs, and
each of the conclusions (1), (2), and (3) is verified. (We shall lift the
restriction $\delta \le \ell/2$ momentarily.)

Now we integrate over each line segment parallel to the $x_2$-axis in $R$,
including its portion in $B_{c_1\delta}(z)$, which is continued downward until it
meets $\partial\Omega$. Call such a segment $I(x_1)$. Then, using (30) we see that

\[ \int_{I(x_1)} |v(x_1, x_2)|^2\,dx_2 \le M^2\delta^2 \int_{I(x_1)} |\nabla v(x_1, x_2)|^2\,dx_2, \]

and an integration in $x_1$ gives

\[ \int_{B_{c_1\delta}(z)} |v(x)|^2\,dx \le M^2\delta^2 \int_{R\cap\Omega} |\nabla v(x)|^2\,dx. \]

However, we note that $B_{c_1\delta}(z) \subset R$, and $B_{c_2\delta}(z) \supset R$ when $c_2 \ge 2$. Thus
the desired inequality (32) is established, still under the assumption that
$\delta$ is small, that is, $\delta \le \ell/2$. When $\delta > \ell/2$ it suffices merely to use the
crude estimate (29) and the proposition is then proved. The proof of the
theorem is therefore complete.

5 Exercises 

1. Suppose $f \in L^2(\mathbb{R}^d)$ and $k \in L^1(\mathbb{R}^d)$.

(a) Show that $(f * k)(x) = \int f(x - y)k(y)\,dy$ converges for a.e. $x$.

(b) Prove that $\|f * k\|_{L^2(\mathbb{R}^d)} \le \|f\|_{L^2(\mathbb{R}^d)}\,\|k\|_{L^1(\mathbb{R}^d)}$.

(c) Establish $\widehat{(f * k)}(\xi) = \hat{k}(\xi)\hat{f}(\xi)$ for a.e. $\xi$.

(d) The operator $Tf = f * k$ is a Fourier multiplier operator with multiplier
$m(\xi) = \hat{k}(\xi)$.

[Hint: See Exercise 21 in Chapter 2.]

2. Consider the Mellin transform defined initially for continuous functions $f$ of
compact support in $\mathbb{R}^+ = \{t \in \mathbb{R} : t > 0\}$ and $x \in \mathbb{R}$ by

\[ \mathcal{M}f(x) = \int_0^\infty f(t)\,t^{ix-1}\,dt. \]

Prove that $(2\pi)^{-1/2}\mathcal{M}$ extends to a unitary operator from $L^2(\mathbb{R}^+, dt/t)$ to $L^2(\mathbb{R})$.
The Mellin transform serves on $\mathbb{R}^+$, with its multiplicative structure, the same
purpose as the Fourier transform on $\mathbb{R}$, with its additive structure.

3. Let $F(z)$ be a bounded holomorphic function in the half-plane. Show in two
ways that $\lim_{y\to 0} F(x + iy)$ exists for a.e. $x$.

(a) By using the fact that $F(z)/(z + i)$ is in $H^2(\mathbb{R}^2_+)$.

(b) By noting that $G(z) = F\!\left(i\,\frac{1-z}{1+z}\right)$ is a bounded holomorphic function in the
unit disc, and using Exercise 17 in the previous chapter.

4. Consider $F(z) = e^{i/z}/(z + i)$ in the upper half-plane. Note that $F(x + iy) \in
L^2(\mathbb{R})$, for each $y > 0$ and $y = 0$. Observe also that $F(z) \to 0$ as $|z| \to \infty$. However,
$F \notin H^2(\mathbb{R}^2_+)$. Why?

5. For $a < b$, let $S_{a,b}$ denote the strip $\{z = x + iy,\ a < y < b\}$. Define $H^2(S_{a,b})$
to consist of the holomorphic functions $F$ in $S_{a,b}$ so that

\[ \|F\|^2_{H^2(S_{a,b})} = \sup_{a<y<b} \int_{\mathbb{R}} |F(x + iy)|^2\,dx < \infty. \]

Define $H^2(S_{a,\infty})$ and $H^2(S_{-\infty,b})$ to be the obvious variants of the Hardy spaces
for the half-planes $\{z = x + iy,\ y > a\}$ and $\{z = x + iy,\ y < b\}$, respectively.

(a) Show that $F \in H^2(S_{a,b})$ if and only if $F$ can be written as

\[ F(z) = \int_{\mathbb{R}} f(\xi)\,e^{-2\pi i z\xi}\,d\xi, \]

with $\int_{\mathbb{R}} |f(\xi)|^2 (e^{4\pi a\xi} + e^{4\pi b\xi})\,d\xi < \infty$.

(b) Prove that every $F \in H^2(S_{a,b})$ can be decomposed as $F = G_1 + G_2$, where
$G_1 \in H^2(S_{a,\infty})$ and $G_2 \in H^2(S_{-\infty,b})$.

(c) Show that $\lim_{a<y<b,\ y\to a} F(x + iy) = F_a(x)$ exists in the $L^2$-norm and also
almost everywhere, with a similar result for $\lim_{a<y<b,\ y\to b} F(x + iy)$.


6. Suppose $\Omega$ is an open set in $\mathbb{C} = \mathbb{R}^2$, and let $\mathcal{H}$ be the subspace of $L^2(\Omega)$
consisting of holomorphic functions on $\Omega$. Show that $\mathcal{H}$ is a closed subspace of
$L^2(\Omega)$, and hence is a Hilbert space with inner product

\[ (f, g) = \int_{\Omega} f(z)\overline{g(z)}\,dx\,dy, \quad\text{where } z = x + iy. \]

[Hint: Prove that for $f \in \mathcal{H}$, we have $|f(z)| \le \frac{c}{d(z, \Omega^c)}\|f\|$ for $z \in \Omega$, where $c =
\pi^{-1/2}$, using the mean-value property (9). Thus if $\{f_n\}$ is a Cauchy sequence in
$\mathcal{H}$, it converges uniformly on compact subsets of $\Omega$.]

7. Following up on the previous exercise, prove:

(a) If $\{\varphi_n\}_{n=0}^\infty$ is an orthonormal basis of $\mathcal{H}$, then

\[ \sum_{n=0}^{\infty} |\varphi_n(z)|^2 \le \frac{c^2}{d(z, \Omega^c)^2} \quad\text{for } z \in \Omega. \]

(b) The sum

\[ B(z, w) = \sum_{n=0}^{\infty} \varphi_n(z)\overline{\varphi_n(w)} \]

converges absolutely for $(z, w) \in \Omega \times \Omega$, and is independent of the choice of
the orthonormal basis $\{\varphi_n\}$ of $\mathcal{H}$.

(c) To prove (b) it is useful to characterize the function $B(z, w)$, called the
Bergman kernel, by the following property. Let $T$ be the linear transformation
on $L^2(\Omega)$ defined by

\[ Tf(z) = \int_{\Omega} B(z, w)f(w)\,du\,dv, \quad w = u + iv. \]

Then $T$ is the orthogonal projection of $L^2(\Omega)$ to $\mathcal{H}$.

(d) Suppose that $\Omega$ is the unit disc. Then $f \in \mathcal{H}$ exactly when $f(z) = \sum_{n=0}^\infty a_n z^n$,
with

\[ \sum_{n=0}^{\infty} |a_n|^2 (n + 1)^{-1} < \infty. \]

Also, the sequence $\left\{ z^n (n+1)^{1/2} / \pi^{1/2} \right\}_{n=0}^\infty$ is an orthonormal basis of $\mathcal{H}$. Moreover,
in this case

\[ B(z, w) = \frac{1}{\pi(1 - z\overline{w})^2}. \]

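8. Continuing with Exercise 6, suppose $\Omega$ is the upper half-plane $\mathbb{R}^2_+$. Then every
$f \in \mathcal{H}$ has a representation

\[ f(z) = \sqrt{4\pi} \int_0^\infty \hat{f}_0(\xi)\,e^{2\pi i \xi z}\,d\xi, \quad z \in \mathbb{R}^2_+, \tag{34} \]

where $\int_0^\infty |\hat{f}_0(\xi)|^2\,\frac{d\xi}{\xi} < \infty$. Moreover, the mapping $\hat{f}_0 \to f$ given by (34) is a unitary
mapping from $L^2((0, \infty), \frac{d\xi}{\xi})$ to $\mathcal{H}$.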

9. Let $H$ be the Hilbert transform. Verify that

(a) $H^* = -H$, $H^2 = -I$, and $H$ is unitary.

(b) If $\tau_h$ denotes the translation operator, $\tau_h(f)(x) = f(x - h)$, then $H$ commutes
with $\tau_h$, $\tau_h H = H\tau_h$.

(c) If $\delta_a$ denotes the dilation operator, $\delta_a(f)(x) = f(ax)$ with $a > 0$, then $H$
commutes with $\delta_a$, $\delta_a H = H\delta_a$.

A converse is given in Problem 5 below.

10. Let $f \in L^2(\mathbb{R})$ and let $u(x, y)$ be the Poisson integral of $f$, that is $u = (f *
P_y)(x)$, as given in (10) above. Let $v(x, y) = (Hf * P_y)(x)$, the Poisson integral of
the Hilbert transform of $f$. Prove that:

(a) $F(x + iy) = u(x, y) + iv(x, y)$ is analytic in the half-plane $\mathbb{R}^2_+$, so that $u$ and
$v$ are conjugate harmonic functions. We also have $f = \lim_{y\to 0} u(x, y)$ and
$Hf = \lim_{y\to 0} v(x, y)$.

(b) $F(z) = \frac{1}{\pi i} \int_{\mathbb{R}} f(t)\,\frac{dt}{t - z}$.

(c) $v(x, y) = f * Q_y$, where $Q_y(x) = \frac{1}{\pi}\,\frac{x}{x^2 + y^2}$ is the conjugate Poisson kernel.

[Hint: Note that $\frac{i}{\pi z} = P_y(x) + iQ_y(x)$, $z = x + iy$.]

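11. Show that

\[ \left\{ \frac{1}{\pi^{1/2}(i + z)} \left( \frac{i - z}{i + z} \right)^n \right\}_{n=0}^{\infty} \]

is an orthonormal basis of $H^2(\mathbb{R}^2_+)$.

Note that $\left\{ \frac{1}{\pi^{1/2}(i + x)} \left( \frac{i - x}{i + x} \right)^n \right\}_{n=-\infty}^{\infty}$ is an orthonormal basis of $L^2(\mathbb{R})$; see
Exercise 9 in the previous chapter.

[Hint: It suffices to show that if $F \in H^2(\mathbb{R}^2_+)$ and

\[ \int_{-\infty}^{\infty} F(x)\,\frac{(x + i)^n}{(x - i)^{n+1}}\,dx = 0 \quad\text{for } n = 0, 1, 2, \ldots, \]

then $F = 0$. Use the Cauchy integral formula to prove that

\[ \left. \frac{d^n}{dz^n}\left( F(z)(z + i)^n \right) \right|_{z=i} = 0, \]

and thus $F^{(n)}(i) = 0$ for $n = 0, 1, 2, \ldots$.]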

12. We consider whether the inequality

\[ \|u\|_{L^2(\Omega)} \le c\,\|L(u)\|_{L^2(\Omega)} \]

can hold for open sets $\Omega$ that are unbounded.

(a) Assume $d \ge 2$. Show that for each constant coefficient partial differential
operator $L$, there are unbounded connected open sets $\Omega$ for which the above
holds for all $u \in C_0^\infty(\Omega)$.

(b) Show that $\|u\|_{L^2(\mathbb{R}^d)} \le c\,\|L(u)\|_{L^2(\mathbb{R}^d)}$ for all $u \in C_0^\infty(\mathbb{R}^d)$ if and only if
$|P(\xi)| \ge c > 0$ for all $\xi$, where $P$ is the characteristic polynomial of $L$.

[Hint: For (a) consider first $L = (\partial/\partial x_1)^n$ and a strip $\{x : -1 < x_1 < 1\}$.]

13. Suppose $L$ is a linear partial differential operator with constant coefficients.
Show that when $d \ge 2$, the linear space of solutions $u$ of $L(u) = 0$ with $u \in C^\infty(\mathbb{R}^d)$
is not finite-dimensional.

[Hint: Consider the zeroes $\zeta$ of $P(\zeta)$, $\zeta \in \mathbb{C}^d$, where $P$ is the characteristic
polynomial of $L$.]


14. Suppose $F$ and $G$ are two integrable functions on a bounded interval $[a, b]$.
Show that $G$ is the weak derivative of $F$ if and only if $F$ can be corrected on a set
of measure 0, such that $F$ is absolutely continuous and $F'(x) = G(x)$ for almost
every $x$.

[Hint: If $G$ is the weak derivative of $F$, use an approximation to show that

\[ \int_a^b G(x)\varphi(x)\,dx = -\int_a^b F(x)\varphi'(x)\,dx \]

holds for the function $\varphi$ illustrated in Figure 9.]

15. Suppose $f \in L^2(\mathbb{R}^d)$. Prove that there exists $g \in L^2(\mathbb{R}^d)$ such that

\[ \left( \frac{\partial}{\partial x} \right)^{\alpha} f(x) = g(x) \]

in the weak sense, if and only if

\[ (2\pi i \xi)^{\alpha} \hat{f}(\xi) = \hat{g}(\xi) \in L^2(\mathbb{R}^d). \]

16. Sobolev embedding theorem. Suppose $n$ is the smallest integer $> d/2$. If

\[ f \in L^2(\mathbb{R}^d) \quad\text{and}\quad \left( \frac{\partial}{\partial x} \right)^{\alpha} f \in L^2(\mathbb{R}^d) \]

in the weak sense, for all $1 \le |\alpha| \le n$, then $f$ can be modified on a set of measure
zero so that $f$ is continuous and bounded.

[Hint: Express $f$ in terms of $\hat{f}$, and show that $\hat{f} \in L^1(\mathbb{R}^d)$ by the Cauchy-Schwarz
inequality.]

17. The conclusion of the Sobolev embedding theorem fails when $n = d/2$. Consider
the case $d = 2$, and let $f(x) = (\log 1/|x|)^{\alpha}\eta(x)$, where $\eta$ is a smooth cut-off
function with $\eta = 1$ for $x$ near the origin, but $\eta(x) = 0$ if $|x| \ge 1/2$. Let
$0 < \alpha < 1/2$.

(a) Verify that $\partial f/\partial x_1$ and $\partial f/\partial x_2$ are in $L^2$ in the weak sense.

(b) Show that $f$ cannot be corrected on a set of measure zero such that the
resulting function is continuous at the origin.




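18. Consider the linear partial differential operator

\[ L = \sum_{|\alpha| \le n} a_\alpha \left( \frac{\partial}{\partial x} \right)^{\alpha}. \]

Then

\[ P(\xi) = \sum_{|\alpha| \le n} a_\alpha (2\pi i \xi)^{\alpha} \]

is called the characteristic polynomial of $L$. The differential operator $L$ is said
to be elliptic if

\[ |P(\xi)| \ge c|\xi|^n \quad\text{for some } c > 0 \text{ and all } \xi \text{ sufficiently large.} \]

(a) Check that $L$ is elliptic if and only if $\sum_{|\alpha| = n} a_\alpha (2\pi\xi)^{\alpha}$ vanishes only when
$\xi = 0$.

(b) If $L$ is elliptic, prove that for some $c > 0$ the inequality

\[ \left\| \left( \frac{\partial}{\partial x} \right)^{\alpha} \varphi \right\|_{L^2(\mathbb{R}^d)} \le c \left( \|L\varphi\|_{L^2(\mathbb{R}^d)} + \|\varphi\|_{L^2(\mathbb{R}^d)} \right) \]

holds for all $\varphi \in C_0^\infty(\Omega)$ and $|\alpha| \le n$.

(c) Conversely, if (b) holds then $L$ is elliptic.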


19. Suppose $u$ is harmonic in the punctured unit disc $\mathbb{D}^* = \{z \in \mathbb{C} : 0 < |z| < 1\}$.

(a) Show that if $u$ is also continuous at the origin, then $u$ is harmonic throughout
the unit disc.

[Hint: Show that $u$ is weakly harmonic.]

(b) Prove that the Dirichlet problem for the punctured unit disc is in general
not solvable.


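20. Let $F$ be a continuous function on the closure $\overline{\mathbb{D}}$ of the unit disc. Assume that
$F$ is in $C^1$ on the (open) disc $\mathbb{D}$, and $\int_{\mathbb{D}} |\nabla F|^2 < \infty$.

Let $f(e^{i\theta})$ denote the restriction of $F$ to the unit circle, and write $f(e^{i\theta}) \sim
\sum_{n=-\infty}^{\infty} a_n e^{in\theta}$. Prove that $\sum_{n=-\infty}^{\infty} |n|\,|a_n|^2 < \infty$.

[Hint: Write $F(re^{i\theta}) \sim \sum_{n=-\infty}^{\infty} F_n(r)e^{in\theta}$, with $F_n(1) = a_n$. Express $\int_{\mathbb{D}} |\nabla F|^2$ in
polar coordinates, and use the fact that

\[ \frac{1}{2}|F(1)|^2 \le L^{-1} \int_{1/2}^{1} |F'(r)|^2\,dr + L \int_{1/2}^{1} |F(r)|^2\,dr, \]

for $L \ge 2$; apply this to $F = F_n$, $L = |n|$.]

6 Problems

1. Suppose $F_0(x) \in L^2(\mathbb{R})$. Then a necessary and sufficient condition that there
exists an entire analytic function $F$, such that $|F(z)| \le Ae^{a|z|}$ for all $z \in \mathbb{C}$, and
$F_0(x) = F(x)$ a.e. $x \in \mathbb{R}$, is that $\hat{F}_0(\xi) = 0$ whenever $|\xi| > a/2\pi$.

[Hint: Consider the regularization $F^{\epsilon}(z) = \int_{-\infty}^{\infty} F(z - t)\varphi_\epsilon(t)\,dt$ and apply to it
the considerations in Theorem 3.3 of Chapter 4 in Book II.]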

2. Suppose $\Omega$ is an open bounded subset of $\mathbb{R}^2$. A boundary Lipschitz arc $\gamma$ is
a portion of $\partial\Omega$ which after a rotation of the axes is represented as

\[ \gamma = \{(x_1, x_2) : x_2 = \eta(x_1),\ a \le x_1 \le b\}, \]

where $a < b$ and $\gamma \subset \partial\Omega$. It is also supposed that

\[ |\eta(x_1) - \eta(x_1')| \le M|x_1 - x_1'|, \quad\text{whenever } x_1, x_1' \in [a, b], \tag{35} \]

and moreover if $\gamma_\delta = \{(x_1, x_2) : x_2 - \delta \le \eta(x_1) \le x_2\}$, then $\gamma_\delta \cap \Omega = \emptyset$ for some
$\delta > 0$. (Note that the condition (35) is satisfied if $\eta \in C^1([a, b])$.)

Suppose $\Omega$ satisfies the following condition. There are finitely many open discs
$D_1, D_2, \ldots, D_m$ with the property that $\bigcup_j D_j$ contains $\partial\Omega$ and for each $j$, $\partial\Omega \cap D_j$
is a boundary Lipschitz arc (see Figure 10). Then $\Omega$ verifies the outside-triangle
condition of Theorem 4.12, guaranteeing the solvability of the boundary value
problem.

Figure 10. A domain with boundary Lipschitz arcs


3.* Suppose the bounded domain $\Omega$ has as its boundary a closed simple continuous
curve. Then the boundary value problem is solvable for $\Omega$. This is because there
exists a conformal map $\Phi$ of the unit disc $\mathbb{D}$ to $\Omega$ that extends to a continuous
bijection from $\overline{\mathbb{D}}$ to $\overline{\Omega}$. (See Section 1.3 and Problem 6* in Chapter 8 of Book II.)

4. Consider the two domains $\Omega$ in $\mathbb{R}^2$ given by Figure 11.

Figure 11. Domains with a cusp (Domain I and Domain II)

The set I has as its boundary a smooth curve, with the exception of an (inside)
cusp. The set II is similar, except it has an outside cusp. Both I and II fall
within the scope of the result of Problem 3, and hence the boundary value problem
is solvable in each case. However, II satisfies the outside-triangle condition while
I does not.

5. Let $T$ be a Fourier multiplier operator on $L^2(\mathbb{R}^d)$. That is, suppose there
is a bounded function $m$ such that $\widehat{(Tf)}(\xi) = m(\xi)\hat{f}(\xi)$, all $f \in L^2(\mathbb{R}^d)$. Then $T$
commutes with translations, $\tau_h T = T\tau_h$, where $\tau_h(f)(x) = f(x - h)$, for all $h \in \mathbb{R}^d$.

Conversely any bounded operator on $L^2(\mathbb{R}^d)$ that commutes with translations
is a Fourier multiplier operator.

[Hint: It suffices to prove that if a bounded operator $\hat{T}$ commutes with multiplication
by exponentials $e^{2\pi i \xi\cdot h}$, $h \in \mathbb{R}^d$, then there is an $m$ so that $\hat{T}g(\xi) = m(\xi)g(\xi)$
for all $g \in L^2(\mathbb{R}^d)$. To do this, show first that

\[ \hat{T}(\Phi g) = \Phi\,\hat{T}(g), \quad\text{all } g \in L^2(\mathbb{R}^d), \text{ whenever } \Phi \in C_0^\infty(\mathbb{R}^d). \]

Next, for large $N$, choose $\Phi$ so that it equals 1 in the ball $|\xi| \le N$. Then $m(\xi) =
\hat{T}(\Phi)(\xi)$ for $|\xi| \le N$.]

As a consequence of this theorem show that if $T$ is a bounded operator on $L^2(\mathbb{R})$
that commutes with translations and dilations (as in Exercise 9 above), then

(a) If $(Tf)(-x) = T(f(-x))$ it follows $T = cI$, where $c$ is an appropriate constant
and $I$ the identity operator.

(b) If $(Tf)(-x) = -T(f(-x))$, then $T = cH$, where $c$ is an appropriate constant
and $H$ the Hilbert transform.


6. This problem provides an example of the contrast between analysis on $L^1(\mathbb{R}^d)$
and $L^2(\mathbb{R}^d)$.




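Recall that if $f$ is locally integrable on $\mathbb{R}^d$, the maximal function $f^*$ is defined
by

$$f^*(x) = \sup_{x \in B} \frac{1}{m(B)} \int_B |f(y)|\,dy,$$

where the supremum is taken over all balls containing the point $x$.

Complete the following outline to prove that there exists a constant $C$ so that

$$\|f^*\|_{L^2(\mathbb{R}^d)} \le C\,\|f\|_{L^2(\mathbb{R}^d)}.$$

In other words, the map that takes $f$ to $f^*$ (although not linear) is bounded
on $L^2(\mathbb{R}^d)$. This differs notably from the situation in $L^1(\mathbb{R}^d)$, as we observed in
Chapter 3.

(a) For each $\alpha > 0$, prove that if $f \in L^2(\mathbb{R}^d)$, then

$$m(\{x : f^*(x) > \alpha\}) \le \frac{2A}{\alpha} \int_{|f| > \alpha/2} |f(x)|\,dx.$$

Here, $A = 3^d$ will do.

[Hint: Consider $f_1(x) = f(x)$ if $|f(x)| > \alpha/2$ and 0 otherwise. Check that
$f_1 \in L^1(\mathbb{R}^d)$, and

$$\{x : f^*(x) > \alpha\} \subset \{x : f_1^*(x) > \alpha/2\}.]$$

(b) Show that

$$\int_{\mathbb{R}^d} |f^*(x)|^2\,dx = 2\int_0^\infty \alpha\, m(E_\alpha)\,d\alpha,$$

where $E_\alpha = \{x : f^*(x) > \alpha\}$.

(c) Prove that $\|f^*\|_{L^2(\mathbb{R}^d)} \le C\,\|f\|_{L^2(\mathbb{R}^d)}$.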



Abstract Measure and 
Integration Theory 


What immediately suggests itself, then, is that these
characteristic properties themselves be treated as the 
main object of investigation, by defining and dealing 
with abstract objects which need satisfy no other con- 
ditions than those required by the very theory to be 
developed. 

This procedure has been made use of — more or 
less consciously — by mathematicians of every era. 
The geometry of Euclid and the literal algebra of the 
sixteenth and seventeenth centuries arose in this way. 
But only in more recent times has this method, called 
the axiomatic method, been consistently developed 
and carried through to its logical conclusion. 

It is our intention to treat the theories of measure 
and integration by means of the axiomatic method just 
described. 

C. Carathéodory, 1918


In much of mathematics integration plays a significant role. It is used, 
in one form or another, when dealing with questions that arise in analysis 
on a variety of different spaces. While in some situations it suffices to 
integrate continuous or other simple functions on these spaces, the deeper 
study of a number of other problems requires integration based on the 
more refined ideas of measure theory. The development of these ideas, 
going beyond the setting of the Euclidean space $\mathbb{R}^d$, is the goal of this
chapter.

The starting point is a fruitful insight of Carathéodory and the resulting
theorems that lead to construction of measures in very general
circumstances. Once this has been achieved, the deduction of the fun- 
damental facts about integration in the general context then follows a 
familiar path. 

We apply the abstract theory to obtain several useful results: the 
theory of product measures; the polar coordinate integration formula, 
which is a consequence of this; the construction of the Lebesgue-Stieltjes 
integral and its corresponding Borel measure on the real line; and the
general notion of absolute continuity. Finally, we treat some of the basic
limit theorems of ergodic theory. This not only gives an illustration of 
the abstract framework we have established, but also provides a link with 
the differentiation theorems studied in Chapter 3. 

1 Abstract measure spaces 

A measure space consists of a set X equipped with two fundamental 
objects: 

(I) A $\sigma$-algebra $\mathcal{M}$ of “measurable” sets, which is a non-empty collection
of subsets of $X$ closed under complements and countable
unions and intersections.

(II) A measure $\mu : \mathcal{M} \to [0, \infty]$ with the following defining property:
if $E_1, E_2, \ldots$ is a countable family of disjoint sets in $\mathcal{M}$, then

$$\mu\Big(\bigcup_{n=1}^\infty E_n\Big) = \sum_{n=1}^\infty \mu(E_n).$$

A measure space is therefore often denoted by the triple $(X, \mathcal{M}, \mu)$ to emphasize
its three main components. Sometimes, however, when there is
no ambiguity we will abbreviate this notation by referring to the measure
space as $(X, \mu)$, or simply $X$.

A feature that a measure space often enjoys is the property of being
$\sigma$-finite. This means that $X$ can be written as the union of countably
many measurable sets of finite measure.

At this early stage we give only two simple examples of measure spaces: 

(i) The first is the discrete example with $X$ a countable set, $X = \{x_n\}_{n=1}^\infty$,
$\mathcal{M}$ the collection of all subsets of $X$, and the measure
$\mu$ determined by $\mu(x_n) = \mu_n$, with $\{\mu_n\}_{n=1}^\infty$ a given sequence of
(extended) non-negative numbers. Note that $\mu(E) = \sum_{x_n \in E} \mu_n$.

When $\mu_n = 1$ for all $n$, we call $\mu$ the counting measure, and also
denote it by $\#$. In this case integration will amount to nothing but
the summation of (absolutely) convergent series.

(ii) Here $X = \mathbb{R}^d$, $\mathcal{M}$ is the collection of Lebesgue measurable sets, and
$\mu(E) = \int_E f\,dx$, where $f$ is a given non-negative measurable function
on $\mathbb{R}^d$. The case $f = 1$ corresponds to the Lebesgue measure.
The countable additivity of $\mu$ follows from the usual additivity and
limiting properties of integrals of non-negative functions proved in
Chapter 2.

The construction of measure spaces relevant for most applications requires
further ideas, and to these we now turn.
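
The discrete example (i) can be tried out on a small finite model. The following is only a sketch: the five points, the masses $1/2^n$, and the helper name `discrete_measure` are assumptions made for illustration and are not part of the text.

```python
from fractions import Fraction

def discrete_measure(weights):
    """Return mu with mu(E) = sum of weights[x] over the points x of E."""
    def mu(E):
        return sum((weights[x] for x in E), Fraction(0))
    return mu

# Hypothetical finite model of example (i): points 1..5 with masses mu_n = 1/2^n.
weights = {n: Fraction(1, 2**n) for n in range(1, 6)}
mu = discrete_measure(weights)

E1, E2 = {1, 3}, {2, 5}                 # two disjoint sets
assert mu(E1 | E2) == mu(E1) + mu(E2)   # additivity on disjoint sets

# Counting measure: every mass equals 1, so the measure of E is its cardinality.
count = discrete_measure({n: Fraction(1) for n in range(1, 6)})
assert count(E1 | E2) == len(E1 | E2)
```

Exact rational arithmetic is used so that the additivity check is an identity rather than a floating-point approximation.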

1.1 Exterior measures and Carathéodory’s theorem

To begin the construction of a measure and its corresponding measurable 
sets in the general setting requires, as in the special case of Lebesgue mea- 
sure considered in Chapter 1, a prerequisite notion of “exterior” measure. 
This is defined as follows. 

Let $X$ be a set. An exterior measure (or outer measure) $\mu_*$ on
$X$ is a function $\mu_*$ from the collection of all subsets of $X$ to $[0, \infty]$ that
satisfies the following properties:

(i) $\mu_*(\emptyset) = 0$.

(ii) If $E_1 \subset E_2$, then $\mu_*(E_1) \le \mu_*(E_2)$.

(iii) If $E_1, E_2, \ldots$ is a countable family of sets, then

$$\mu_*\Big(\bigcup_{j=1}^\infty E_j\Big) \le \sum_{j=1}^\infty \mu_*(E_j).$$

For instance, the exterior Lebesgue measure $m_*$ in $\mathbb{R}^d$ defined in Chapter 1
enjoys all these properties. In fact, this example belongs to a
large class of exterior measures that can be obtained using “coverings”
by a family of special sets whose measures are taken as known. This
idea is systematized by the notion of a “premeasure” taken up below in
Section 1.3. A different type of example is the exterior $\alpha$-dimensional
Hausdorff measure $m_\alpha^*$ defined in Chapter 7.

Given an exterior measure $\mu_*$, the problem that one faces is how to define
the corresponding notion of measurable sets. In the case of Lebesgue
measure in $\mathbb{R}^d$ such sets were characterized by their difference from open
(or closed) sets, when considered in terms of $\mu_*$. For the general case,
Carathéodory found an ingenious substitute condition. It is as follows.

A set $E$ in $X$ is Carathéodory measurable or simply measurable
if one has

(1) $\mu_*(A) = \mu_*(E \cap A) + \mu_*(E^c \cap A)$ for every $A \subset X$.

In other words, $E$ separates any set $A$ in two parts that behave well
in regard to the exterior measure $\mu_*$. For this reason, (1) is sometimes
referred to as the separation condition. One can show that in $\mathbb{R}^d$ with the
Lebesgue exterior measure the notion of measurability (1) is equivalent
to the definition of Lebesgue measurability given in Chapter 1. (See
Exercise 3.)

A first observation we make is that to prove a set $E$ is measurable, it
suffices to verify

$$\mu_*(A) \ge \mu_*(E \cap A) + \mu_*(E^c \cap A) \quad \text{for all } A \subset X,$$

since the reverse inequality is automatically verified by the sub-additivity 
property (iii) of the exterior measure. We see immediately from the 
definition that sets of exterior measure zero are necessarily measurable. 
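
On a finite set the separation condition (1) can be checked by brute force. The sketch below uses a hypothetical exterior measure on a three-point set (the values 0, 1, 2 and the function names are assumptions chosen for illustration); it is monotone and sub-additive but not additive, and only the empty set and the whole space turn out to satisfy (1).

```python
from itertools import combinations

def subsets(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def caratheodory_measurable(X, mu_star):
    """Sets E with mu_*(A) = mu_*(E & A) + mu_*(A - E) for every subset A of X."""
    all_sets = subsets(X)
    return {E for E in all_sets
            if all(mu_star(A) == mu_star(E & A) + mu_star(A - E)
                   for A in all_sets)}

X = frozenset({1, 2, 3})

def mu_star(A):
    # Hypothetical exterior measure: 0 on the empty set, 2 on X, 1 otherwise.
    if not A:
        return 0
    return 2 if A == X else 1

measurable = caratheodory_measurable(X, mu_star)
```

For instance, with $E = \{1\}$ and $A = \{1, 2\}$ one has $\mu_*(A) = 1$ but $\mu_*(E \cap A) + \mu_*(E^c \cap A) = 2$, so $E$ fails the condition.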

The remarkable fact about the definition (1) is summarized in the next
theorem.

Theorem 1.1 Given an exterior measure $\mu_*$ on a set $X$, the collection
$\mathcal{M}$ of Carathéodory measurable sets forms a $\sigma$-algebra. Moreover, $\mu_*$
restricted to $\mathcal{M}$ is a measure.

Proof. Clearly, $\emptyset$ and $X$ belong to $\mathcal{M}$ and the symmetry inherent
in condition (1) shows that $E^c \in \mathcal{M}$ whenever $E \in \mathcal{M}$. Thus $\mathcal{M}$ is non-empty
and closed under complements.

Next, we prove that $\mathcal{M}$ is closed under finite unions of disjoint sets,
and $\mu_*$ is finitely additive on $\mathcal{M}$. Indeed, if $E_1, E_2 \in \mathcal{M}$, and $A$ is any
subset of $X$, then

$$\mu_*(A) = \mu_*(E_2 \cap A) + \mu_*(E_2^c \cap A)$$
$$= \mu_*(E_1 \cap E_2 \cap A) + \mu_*(E_1^c \cap E_2 \cap A) + \mu_*(E_1 \cap E_2^c \cap A) + \mu_*(E_1^c \cap E_2^c \cap A)$$
$$\ge \mu_*((E_1 \cup E_2) \cap A) + \mu_*((E_1 \cup E_2)^c \cap A),$$

where in the first two lines we have used the measurability condition
on $E_2$ and then $E_1$, and where the last inequality was obtained using
the sub-additivity of $\mu_*$ and the fact that $E_1 \cup E_2 = (E_1 \cap E_2) \cup (E_1^c \cap E_2) \cup (E_1 \cap E_2^c)$.
Therefore, we have $E_1 \cup E_2 \in \mathcal{M}$, and if $E_1$ and $E_2$ are
disjoint, we find

$$\mu_*(E_1 \cup E_2) = \mu_*(E_1 \cap (E_1 \cup E_2)) + \mu_*(E_1^c \cap (E_1 \cup E_2)) = \mu_*(E_1) + \mu_*(E_2).$$

Finally, it suffices to show that $\mathcal{M}$ is closed under countable unions of
disjoint sets, and that $\mu_*$ is countably additive on $\mathcal{M}$. Let $E_1, E_2, \ldots$
denote a countable collection of disjoint sets in $\mathcal{M}$, and define

$$G_n = \bigcup_{j=1}^n E_j \quad \text{and} \quad G = \bigcup_{j=1}^\infty E_j.$$

For each $n$, the set $G_n$ is a finite union of sets in $\mathcal{M}$, hence $G_n \in \mathcal{M}$.
Moreover, for any $A \subset X$ we have

$$\mu_*(G_n \cap A) = \mu_*(E_n \cap (G_n \cap A)) + \mu_*(E_n^c \cap (G_n \cap A))$$
$$= \mu_*(E_n \cap A) + \mu_*(G_{n-1} \cap A)$$
$$= \sum_{j=1}^n \mu_*(E_j \cap A),$$

where the last equality is obtained by induction. Since we know that
$G_n \in \mathcal{M}$, and $G^c \subset G_n^c$, we find that

$$\mu_*(A) = \mu_*(G_n \cap A) + \mu_*(G_n^c \cap A) \ge \sum_{j=1}^n \mu_*(E_j \cap A) + \mu_*(G^c \cap A).$$

Letting $n$ tend to infinity, we obtain

$$\mu_*(A) \ge \sum_{j=1}^\infty \mu_*(E_j \cap A) + \mu_*(G^c \cap A) \ge \mu_*(G \cap A) + \mu_*(G^c \cap A) \ge \mu_*(A).$$

Therefore all the inequalities above are equalities, and we conclude that
$G \in \mathcal{M}$, as desired. Moreover, by taking $A = G$ in the above, we find
that $\mu_*$ is countably additive on $\mathcal{M}$, and the proof of the theorem is
complete.

Our previous observation that sets of exterior measure 0 are Carathéodory
measurable shows that the measure space $(X, \mathcal{M}, \mu)$ in the theorem
is complete: whenever $F \in \mathcal{M}$ satisfies $\mu(F) = 0$ and $E \subset F$, then
$E \in \mathcal{M}$.

1.2 Metric exterior measures 

If the underlying set X is endowed with a “distance function” or “met- 
ric,” there is a particular class of exterior measures that is of interest in 
practice. The importance of these exterior measures is that they induce
measures on the natural $\sigma$-algebra generated by the open sets in $X$.

A metric space is a set $X$ equipped with a function $d : X \times X \to [0, \infty)$ that satisfies:

(i) $d(x, y) = 0$ if and only if $x = y$.

(ii) $d(x, y) = d(y, x)$ for all $x, y \in X$.

(iii) $d(x, z) \le d(x, y) + d(y, z)$, for all $x, y, z \in X$.

The last property is of course called the triangle inequality, and a function
$d$ that satisfies all these conditions is called a metric on $X$. For
example, the set $\mathbb{R}^d$ with $d(x, y) = |x - y|$ is a metric space. Another
example is provided by the space of continuous functions on a compact
set $K$ with $d(f, g) = \sup_{x \in K} |f(x) - g(x)|$.

A metric space (X, d) is naturally equipped with a family of open balls. 
Here 


$$B_r(x) = \{y \in X : d(x, y) < r\}$$

defines the open ball of radius $r$ centered at $x$. Together with this, we say
that a set $\mathcal{O} \subset X$ is open if for any $x \in \mathcal{O}$ there exists $r > 0$ so that the
open ball $B_r(x)$ is contained in $\mathcal{O}$. A set is closed if its complement is
open. With these definitions, one checks easily that an (arbitrary) union
of open sets is open, and a similar intersection of closed sets is closed.

Finally, on a metric space $X$ we can define, as in Section 3 of Chapter 1,
the Borel $\sigma$-algebra, $\mathcal{B}_X$, that is the smallest $\sigma$-algebra of sets in $X$
that contains the open sets of $X$. In other words $\mathcal{B}_X$ is the intersection
of all $\sigma$-algebras that contain the open sets. Elements in $\mathcal{B}_X$ are called
Borel sets.

We now turn our attention to those exterior measures on $X$ with the
special property of being additive on sets that are “well separated.” We
show that this property guarantees that this exterior measure defines a
measure on the Borel $\sigma$-algebra. This is achieved by proving that all
Borel sets are Carathéodory measurable.

Given two sets $A$ and $B$ in a metric space $(X, d)$, the distance between
$A$ and $B$ is defined by

$$d(A, B) = \inf\{d(x, y) : x \in A \text{ and } y \in B\}.$$

Then an exterior measure $\mu_*$ on $X$ is a metric exterior measure if it
satisfies

$$\mu_*(A \cup B) = \mu_*(A) + \mu_*(B) \quad \text{whenever } d(A, B) > 0.$$

This property played a key role in the case of exterior Lebesgue measure.

Theorem 1.2 If $\mu_*$ is a metric exterior measure on a metric space $X$,
then the Borel sets in $X$ are measurable. Hence $\mu_*$ restricted to $\mathcal{B}_X$ is a
measure.

Proof. By the definition of $\mathcal{B}_X$ it suffices to prove that closed sets
in $X$ are Carathéodory measurable. Therefore, let $F$ denote a closed set
and $A$ a subset of $X$ with $\mu_*(A) < \infty$. For each $n > 0$, let

$$A_n = \{x \in F^c \cap A : d(x, F) \ge 1/n\}.$$

Then $A_n \subset A_{n+1}$, and since $F$ is closed we have $F^c \cap A = \bigcup_{n=1}^\infty A_n$.
Also, the distance between $F \cap A$ and $A_n$ is $\ge 1/n$, and since $\mu_*$ is a
metric exterior measure, we have

(2) $\mu_*(A) \ge \mu_*((F \cap A) \cup A_n) = \mu_*(F \cap A) + \mu_*(A_n)$.

Next, we claim that

(3) $\lim_{n \to \infty} \mu_*(A_n) = \mu_*(F^c \cap A)$.

To see this, let $B_n = A_{n+1} \cap A_n^c$ and note that

$$d(B_{n+1}, A_n) \ge \frac{1}{n(n+1)}.$$

Indeed, if $x \in B_{n+1}$ and $d(x, y) < 1/n(n+1)$, then, since $d(x, F) < 1/(n+1)$, the triangle inequality shows
that $d(y, F) < 1/n(n+1) + 1/(n+1) = 1/n$, hence $y \notin A_n$. Therefore

$$\mu_*(A_{2k+1}) \ge \mu_*(B_{2k} \cup A_{2k-1}) = \mu_*(B_{2k}) + \mu_*(A_{2k-1}),$$

and this implies that

$$\mu_*(A_{2k+1}) \ge \sum_{j=1}^k \mu_*(B_{2j}).$$

A similar argument also gives

$$\mu_*(A_{2k}) \ge \sum_{j=1}^k \mu_*(B_{2j-1}).$$

Since $\mu_*(A)$ is finite, we find that both series $\sum \mu_*(B_{2j})$ and $\sum \mu_*(B_{2j-1})$
are convergent. Finally, we note that

$$\mu_*(A_n) \le \mu_*(F^c \cap A) \le \mu_*(A_n) + \sum_{j=n+1}^\infty \mu_*(B_j),$$

and this proves the limit (3). Letting $n$ tend to infinity in the inequality (2)
we find that $\mu_*(A) \ge \mu_*(F \cap A) + \mu_*(F^c \cap A)$, and hence $F$ is
measurable, as was to be shown.

Given a metric space $X$, a measure $\mu$ defined on the Borel sets of $X$
will be referred to as a Borel measure. Borel measures that assign a
finite measure to all balls (of finite radius) also satisfy a useful regularity
property. The requirement that $\mu(B) < \infty$ for all balls $B$ is satisfied in
many (but not in all) circumstances that arise in practice.$^1$ When it does
hold, we get the following proposition.

Proposition 1.3 Suppose the Borel measure $\mu$ is finite on all balls in
$X$ of finite radius. Then for any Borel set $E$ and any $\epsilon > 0$, there are
an open set $\mathcal{O}$ and a closed set $F$ such that $E \subset \mathcal{O}$ and $\mu(\mathcal{O} - E) < \epsilon$,
while $F \subset E$ and $\mu(E - F) < \epsilon$.

Proof. We need the following preliminary observation. Suppose
$F^* = \bigcup_{k=1}^\infty F_k$, where the $F_k$ are closed sets. Then for any $\epsilon > 0$, we can
find a closed set $F \subset F^*$ such that $\mu(F^* - F) < \epsilon$. To prove this we can
assume that the sets $\{F_k\}$ are increasing. Fix a point $x_0 \in X$, and let $B_n$
denote the ball $\{x : d(x, x_0) < n\}$, with $B_0 = \emptyset$. Since $\bigcup_{n=1}^\infty B_n = X$,
we have that

$$F^* = \bigcup_{n=1}^\infty F^* \cap (\overline{B_n} - B_{n-1}).$$

Now for each $n$, $F^* \cap (\overline{B_n} - B_{n-1})$ is the limit as $k \to \infty$ of the increasing
sequence of closed sets $F_k \cap (\overline{B_n} - B_{n-1})$, so (recalling that $\overline{B_n}$ has finite
measure) we can find an $N = N(n)$ so that $(F^* - F_{N(n)}) \cap (\overline{B_n} - B_{n-1})$
has measure less than $\epsilon/2^n$. If we now let

$$F = \bigcup_{n=1}^\infty \big(F_{N(n)} \cap (\overline{B_n} - B_{n-1})\big),$$

it follows that the measure of $F^* - F$ is less than $\sum_{n=1}^\infty \epsilon/2^n = \epsilon$. We
also see that $F \cap \overline{B_k}$ is closed since it is the finite union of closed sets.
Thus $F$ itself is closed because, as is easily seen, any set $F$ is closed
whenever the sets $F \cap \overline{B_k}$ are closed for all $k$.

Having established the observation, we call $\mathcal{C}$ the collection of all sets
that satisfy the conclusions of the proposition. Notice first that if $E$
belongs to $\mathcal{C}$ then automatically so does its complement.

$^1$This restriction is not always valid for the Hausdorff measures that are considered in
the next chapter.

Suppose now that $E = \bigcup_{k=1}^\infty E_k$, with each $E_k \in \mathcal{C}$. Then there are
open sets $\mathcal{O}_k$, $\mathcal{O}_k \supset E_k$, with $\mu(\mathcal{O}_k - E_k) < \epsilon/2^k$. However, if $\mathcal{O} = \bigcup_{k=1}^\infty \mathcal{O}_k$,
then $\mathcal{O} - E \subset \bigcup_{k=1}^\infty (\mathcal{O}_k - E_k)$, and so $\mu(\mathcal{O} - E) \le \sum_{k=1}^\infty \epsilon/2^k = \epsilon$.

Next, there are closed sets $F_k \subset E_k$ with $\mu(E_k - F_k) < \epsilon/2^k$. Thus if
$F^* = \bigcup_{k=1}^\infty F_k$, we see as before that $\mu(E - F^*) < \epsilon$. However, $F^*$ is not
necessarily closed, so we can use our preliminary observation to find a
closed set $F \subset F^*$ with $\mu(F^* - F) < \epsilon$. Thus $\mu(E - F) < 2\epsilon$. Since $\epsilon$ is
arbitrary, this proves that $\bigcup_{k=1}^\infty E_k$ belongs to $\mathcal{C}$.

Let us finally note that any open set $\mathcal{O}$ is in $\mathcal{C}$. The property regarding
containment by open sets is immediate. To find a closed $F \subset \mathcal{O}$, so
that $\mu(\mathcal{O} - F) < \epsilon$, let $F_k = \{x \in \overline{B_k} : d(x, \mathcal{O}^c) \ge 1/k\}$. Then it is clear
that each $F_k$ is closed and $\mathcal{O} = \bigcup_{k=1}^\infty F_k$. We then need only apply the
observation again to find the required set $F$. Thus we have shown that $\mathcal{C}$
is a $\sigma$-algebra that contains the open sets, and hence all Borel sets. The
proposition is therefore proved.

1.3 The extension theorem 

As we have seen, a class of measurable sets on X can be constructed 
once we start with a given exterior measure. However, the definition of 
an exterior measure usually depends on a more primitive idea of measure 
defined on a simpler class of sets. This is the role of a premeasure defined 
below. As we will show, any premeasure can be extended to a measure 
on X. We begin with several definitions. 

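Let $X$ be a set. An algebra in $X$ is a non-empty collection of subsets
of $X$ that is closed under complements, finite unions, and finite intersections.
Let $\mathcal{A}$ be an algebra in $X$. A premeasure on an algebra $\mathcal{A}$ is a
function $\mu_0 : \mathcal{A} \to [0, \infty]$ that satisfies:

(i) $\mu_0(\emptyset) = 0$.

(ii) If $E_1, E_2, \ldots$ is a countable collection of disjoint sets in $\mathcal{A}$ with
$\bigcup_{k=1}^\infty E_k \in \mathcal{A}$, then

$$\mu_0\Big(\bigcup_{k=1}^\infty E_k\Big) = \sum_{k=1}^\infty \mu_0(E_k).$$

In particular, $\mu_0$ is finitely additive on $\mathcal{A}$.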


Premeasures give rise to exterior measures in a natural way.

Lemma 1.4 If $\mu_0$ is a premeasure on an algebra $\mathcal{A}$, define $\mu_*$ on any
subset $E$ of $X$ by

$$\mu_*(E) = \inf\Big\{\sum_{j=1}^\infty \mu_0(E_j) : E \subset \bigcup_{j=1}^\infty E_j, \text{ where } E_j \in \mathcal{A} \text{ for all } j\Big\}.$$

Then, $\mu_*$ is an exterior measure on $X$ that satisfies:

(i) $\mu_*(E) = \mu_0(E)$ for all $E \in \mathcal{A}$.

(ii) All sets in $\mathcal{A}$ are measurable in the sense of (1).

Proof. Proving that $\mu_*$ is an exterior measure presents no difficulty.
To see why the restriction of $\mu_*$ to $\mathcal{A}$ coincides with $\mu_0$, suppose that
$E \in \mathcal{A}$. Clearly, one always has $\mu_*(E) \le \mu_0(E)$ since $E$ covers itself. To
prove the reverse inequality let $E \subset \bigcup_{j=1}^\infty E_j$, where $E_j \in \mathcal{A}$ for all $j$.
Then, if we set

$$E_k' = E \cap \Big(E_k - \bigcup_{j=1}^{k-1} E_j\Big),$$

the sets $E_k'$ are disjoint elements of $\mathcal{A}$, $E_k' \subset E_k$ and $E = \bigcup_{k=1}^\infty E_k'$. By
(ii) in the definition of a premeasure, we have

$$\mu_0(E) = \sum_{k=1}^\infty \mu_0(E_k') \le \sum_{k=1}^\infty \mu_0(E_k).$$

Therefore, we find that $\mu_0(E) \le \mu_*(E)$, as desired.

Finally, we must prove that sets in $\mathcal{A}$ are measurable for $\mu_*$. Let $A$
be any subset of $X$, $E \in \mathcal{A}$, and $\epsilon > 0$. By definition, there exists a
countable collection $E_1, E_2, \ldots$ of sets in $\mathcal{A}$ such that $A \subset \bigcup_{j=1}^\infty E_j$ and

$$\sum_{j=1}^\infty \mu_0(E_j) \le \mu_*(A) + \epsilon.$$

Since $\mu_0$ is a premeasure, it is finitely additive on $\mathcal{A}$ and therefore

$$\sum_{j=1}^\infty \mu_0(E_j) = \sum_{j=1}^\infty \mu_0(E \cap E_j) + \sum_{j=1}^\infty \mu_0(E^c \cap E_j) \ge \mu_*(E \cap A) + \mu_*(E^c \cap A).$$

Since $\epsilon$ is arbitrary, we conclude that $\mu_*(A) \ge \mu_*(E \cap A) + \mu_*(E^c \cap A)$,
as desired.
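
Both conclusions of the lemma can be verified by hand in a finite setting. In the sketch below, the set $X$, the three "atoms" and their weights are hypothetical choices; the algebra consists of all unions of atoms. For such a finite algebra the infimum over coverings reduces to the total weight of the atoms that meet $E$, since any covering by sets of the algebra must contain each such atom, and that simplification is what `mu_star` implements.

```python
from itertools import combinations

def subsets(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical data: X = {0,...,4}; the algebra is made of all unions of the atoms.
X = frozenset(range(5))
atoms = {frozenset({0, 1}): 2, frozenset({2}): 1, frozenset({3, 4}): 3}

def in_algebra(E):
    """E belongs to the algebra when every atom is inside E or disjoint from E."""
    return all(a <= E or not (a & E) for a in atoms)

def mu_0(E):
    """Premeasure on the algebra: the sum of the weights of the atoms contained in E."""
    assert in_algebra(E)
    return sum(w for a, w in atoms.items() if a <= E)

def mu_star(E):
    """Exterior measure of an arbitrary E: the cheapest covering by sets of the
    algebra, which here is the total weight of the atoms meeting E."""
    return sum(w for a, w in atoms.items() if a & E)

algebra = [E for E in subsets(X) if in_algebra(E)]

# (i) mu_* agrees with mu_0 on the algebra.
agrees = all(mu_star(E) == mu_0(E) for E in algebra)
# (ii) every set of the algebra satisfies the separation condition (1).
separates = all(mu_star(A) == mu_star(E & A) + mu_star(A - E)
                for E in algebra for A in subsets(X))
```

Note that a set outside the algebra, such as the single point $\{1\}$, still receives an exterior measure (the weight of the atom containing it), even though it need not satisfy (1).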

The $\sigma$-algebra generated by an algebra $\mathcal{A}$ is by definition the smallest
$\sigma$-algebra that contains $\mathcal{A}$. The above lemma then provides the necessary
step for extending $\mu_0$ on $\mathcal{A}$ to a measure on the $\sigma$-algebra generated by $\mathcal{A}$.

Theorem 1.5 Suppose that $\mathcal{A}$ is an algebra of sets in $X$, $\mu_0$ a premeasure
on $\mathcal{A}$, and $\mathcal{M}$ the $\sigma$-algebra generated by $\mathcal{A}$. Then there exists a
measure $\mu$ on $\mathcal{M}$ that extends $\mu_0$.

One notes below that $\mu$ is the only such extension of $\mu_0$ under the assumption
that $\mu$ is $\sigma$-finite.

Proof. The exterior measure $\mu_*$ induced by $\mu_0$ defines a measure $\mu$ on
the $\sigma$-algebra of Carathéodory measurable sets. Therefore, by the result
in the previous lemma, $\mu$ is also a measure on $\mathcal{M}$ that extends $\mu_0$. (We
should observe that in general the class $\mathcal{M}$ is not as large as the class of
all sets that are measurable in the sense of (1).)

To prove that this extension is unique whenever $\mu$ is $\sigma$-finite, we argue
as follows. Suppose that $\nu$ is another measure on $\mathcal{M}$ that coincides with
$\mu_0$ on $\mathcal{A}$, and suppose that $F \in \mathcal{M}$ has finite measure. We claim that
$\mu(F) = \nu(F)$. If $F \subset \bigcup E_j$, where $E_j \in \mathcal{A}$, then

$$\nu(F) \le \sum_{j=1}^\infty \nu(E_j) = \sum_{j=1}^\infty \mu_0(E_j),$$

so that $\nu(F) \le \mu(F)$. To prove the reverse inequality, note that if $E = \bigcup E_j$,
then the fact that $\nu$ and $\mu$ are two measures that agree on $\mathcal{A}$ gives

$$\nu(E) = \lim_{n \to \infty} \nu\Big(\bigcup_{j=1}^n E_j\Big) = \lim_{n \to \infty} \mu\Big(\bigcup_{j=1}^n E_j\Big) = \mu(E).$$

If the sets $E_j$ are chosen so that $\mu(E) \le \mu(F) + \epsilon$, then the fact that
$\mu(F) < \infty$ implies $\mu(E - F) \le \epsilon$, and therefore

$$\mu(F) \le \mu(E) = \nu(E) = \nu(F) + \nu(E - F) \le \nu(F) + \mu(E - F) \le \nu(F) + \epsilon.$$

Since $\epsilon$ is arbitrary, we find that $\mu(F) \le \nu(F)$, as desired.

Finally, we use this last result to prove that if $\mu$ is $\sigma$-finite, then $\mu = \nu$.
Indeed, we may write $X = \bigcup E_j$, where $E_1, E_2, \ldots$ is a countable
collection of disjoint sets in $\mathcal{A}$ with $\mu(E_j) < \infty$. Then for any $F \in \mathcal{M}$ we
have

$$\mu(F) = \sum \mu(F \cap E_j) = \sum \nu(F \cap E_j) = \nu(F),$$

and the uniqueness is proved.

For later use we record the following observation about the premeasure
$\mu_0$ on the algebra $\mathcal{A}$ and the resulting measure $\mu_*$ that is implicit in the
argument given above. The details of the proof may be left to the reader.

We define $\mathcal{A}_\sigma$ as the collection of sets that are countable unions of sets
in $\mathcal{A}$, and $\mathcal{A}_{\sigma\delta}$ as the sets that arise as countable intersections of sets in
$\mathcal{A}_\sigma$.

Proposition 1.6 For any set $E$ and any $\epsilon > 0$, there are sets $E_1 \in \mathcal{A}_\sigma$
and $E_2 \in \mathcal{A}_{\sigma\delta}$, such that $E \subset E_1$, $E \subset E_2$, and $\mu_*(E_1) \le \mu_*(E) + \epsilon$,
while $\mu_*(E_2) = \mu_*(E)$.

2 Integration on a measure space 

Once we have established the basic properties of a measure space $X$, the
fundamental facts about measurable functions and integration of such
functions on $X$ can be deduced as in the case of the Lebesgue measure
on $\mathbb{R}^d$. Indeed, the results in Section 4 of Chapter 1 and all of Chapter 2
go over to the general case, with proofs remaining almost word-for-word
the same. For this reason we shall not repeat these arguments but limit
ourselves to the bare statement of the main points. The reader should
have no difficulty in filling in the missing details.

To avoid unnecessary complications we will assume throughout that
the measure space $(X, \mathcal{M}, \mu)$ under consideration is $\sigma$-finite.

Measurable functions

A function $f$ on $X$ with values in the extended real numbers is measurable
if

$$f^{-1}([-\infty, a)) = \{x \in X : f(x) < a\} \in \mathcal{M} \quad \text{for all } a \in \mathbb{R}.$$

With this definition, the basic properties of measurable functions obtained
in the case of $\mathbb{R}^d$ with the Lebesgue measure continue to hold.
(See Properties 3 through 6 for measurable functions in Chapter 1.) For
instance, the collection of measurable functions is closed under the basic
algebraic manipulations. Also, the pointwise limits of measurable
functions are measurable.

The notion of “almost everywhere” that we use now is with respect to
the measure $\mu$. For instance, if $f$ and $g$ are measurable functions on $X$,
we write $f = g$ a.e. to say that

$$\mu(\{x \in X : f(x) \ne g(x)\}) = 0.$$

A simple function on $X$ takes the form

$$\sum_{k=1}^N a_k \chi_{E_k},$$

where $E_k$ are measurable sets of finite measure and $a_k$ are real numbers.
Approximations by simple functions played an important role in the definition
of the Lebesgue integral. Fortunately, this result continues to hold
in our abstract setting.

• Suppose $f$ is a non-negative measurable function on a measure
space $(X, \mathcal{M}, \mu)$. Then there exists a sequence of simple functions
$\{\varphi_k\}_{k=1}^\infty$ that satisfies

$$\varphi_k(x) \le \varphi_{k+1}(x) \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x) \quad \text{for all } x.$$

In general, if $f$ is only measurable, there exists a sequence of simple
functions $\{\varphi_k\}_{k=1}^\infty$ that satisfies

$$|\varphi_k(x)| \le |\varphi_{k+1}(x)| \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x) \quad \text{for all } x.$$

The proof of this result can be obtained with some obvious minor
modifications of the proofs of Theorems 4.1 and 4.2 in Chapter 1. Here,
one makes use of the technical condition imposed on $X$, that of being
$\sigma$-finite. Indeed, if we write $X = \bigcup F_k$, where $F_k \in \mathcal{M}$ are of finite measure,
then the sets $F_k$ play the role of the cubes $Q_k$ in the proof of Theorem 4.1,
Chapter 1.
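
The approximation of a non-negative function by an increasing sequence of simple functions can be illustrated numerically. The sketch below uses the standard dyadic construction $\varphi_k = \min(2^{-k}\lfloor 2^k f \rfloor, k)$; it assumes, for simplicity, a space of finite measure, so the truncation to the sets $F_k$ is omitted, and the function $f(x) = x^2$ and the sample points are hypothetical.

```python
import math

def phi(f, k):
    """k-th dyadic approximation: round f down to a multiple of 2**-k and cap at k."""
    def phi_k(x):
        return min(math.floor(f(x) * 2**k) / 2**k, k)
    return phi_k

f = lambda x: x * x              # hypothetical non-negative function
points = [0.0, 0.3, 1.7, 2.5]    # hypothetical sample points

for x in points:
    values = [phi(f, k)(x) for k in range(1, 12)]
    # The sequence is non-decreasing in k ...
    assert all(a <= b for a, b in zip(values, values[1:]))
    # ... and once k exceeds f(x), phi_k(x) is within 2**-k of f(x).
    assert 0 <= f(x) - values[-1] <= 2**-11
```

Each $\varphi_k$ takes only finitely many values (multiples of $2^{-k}$ between 0 and $k$), which is what makes it a simple function when the underlying sets have finite measure.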

Another important result that generalizes immediately is Egorov’s theorem.

• Suppose $\{f_k\}_{k=1}^\infty$ is a sequence of measurable functions defined on
a measurable set $E \subset X$ with $\mu(E) < \infty$, and $f_k \to f$ a.e. Then
for each $\epsilon > 0$ there is a set $A_\epsilon$ with $A_\epsilon \subset E$, $\mu(E - A_\epsilon) \le \epsilon$, and
such that $f_k \to f$ uniformly on $A_\epsilon$.

Definition and main properties of the integral

The four-step approach to the construction of the Lebesgue integral that
begins with its definition on simple functions given in Chapter 2 carries
over to the situation of a $\sigma$-finite measure space $(X, \mathcal{M}, \mu)$. This leads
to the notion of the integral, with respect to the measure $\mu$, of a non-negative
measurable function $f$ on $X$. This integral is denoted by

$$\int_X f(x)\,d\mu(x),$$

which we sometimes simplify as $\int_X f\,d\mu$, $\int f\,d\mu$ or $\int f$, when no confusion
is possible. Finally, we say that a measurable function $f$ is integrable
if

$$\int_X |f(x)|\,d\mu(x) < \infty.$$

The elementary properties of the integral, such as linearity and monotonicity,
continue to hold in this general setting, as well as the following
basic limit theorems.

(i) Fatou’s lemma. If $\{f_n\}$ is a sequence of non-negative measurable
functions on $X$, then

$$\int \liminf_{n \to \infty} f_n\,d\mu \le \liminf_{n \to \infty} \int f_n\,d\mu.$$

(ii) Monotone convergence. If $\{f_n\}$ is a sequence of non-negative measurable
functions with $f_n \nearrow f$, then

$$\lim_{n \to \infty} \int f_n\,d\mu = \int f\,d\mu.$$

(iii) Dominated convergence. If $\{f_n\}$ is a sequence of measurable functions
with $f_n \to f$ a.e., and such that $|f_n| \le g$ for some integrable
$g$, then

$$\int |f_n - f|\,d\mu \to 0 \quad \text{as } n \to \infty,$$

and consequently

$$\int f_n\,d\mu \to \int f\,d\mu \quad \text{as } n \to \infty.$$

The spaces $L^1(X, \mu)$ and $L^2(X, \mu)$

The equivalence classes (modulo functions that vanish almost everywhere)
of integrable functions on $(X, \mathcal{M}, \mu)$ form a vector space equipped
with a norm. This space is denoted by $L^1(X, \mu)$ and its norm is

(4) $\|f\|_{L^1(X,\mu)} = \int_X |f(x)|\,d\mu(x)$.

Similarly we can define $L^2(X, \mu)$ to be the equivalence class of measurable
functions for which $\int_X |f(x)|^2\,d\mu(x) < \infty$. The norm is then

(5) $\|f\|_{L^2(X,\mu)} = \Big(\int_X |f(x)|^2\,d\mu(x)\Big)^{1/2}$.

There is also an inner product on this space given by

$$(f, g) = \int_X f(x)\overline{g(x)}\,d\mu(x).$$

The proofs of Proposition 2.1 and Theorem 2.2 in Chapter 2, as well as
the results in Section 1 of Chapter 4, extend to this general case and
give:

• The space $L^1(X, \mu)$ is a complete normed vector space.

• The space $L^2(X, \mu)$ is a (possibly non-separable) Hilbert space.

3 Examples 

We now discuss some useful examples of the general theory. 

3.1 Product measures and a general Fubini theorem 

Our first example concerns the construction of product measures, and 
leads to a general form of the theorem that expresses a multiple integral 
as a repeated integral, extending the case of Euclidean space considered 
in Section 3 of Chapter 2. 

Suppose $(X_1, \mathcal{M}_1, \mu_1)$ and $(X_2, \mathcal{M}_2, \mu_2)$ are a pair of measure spaces.
We want to describe the product measure $\mu_1 \times \mu_2$ on the space $X = X_1 \times X_2 = \{(x_1, x_2) : x_1 \in X_1,\ x_2 \in X_2\}$.

We will assume here that the two measure spaces are each complete
and $\sigma$-finite.

We begin by considering measurable rectangles: these are subsets
of $X$ of the form $A \times B$, with $A$ and $B$ measurable sets, that is, $A \in \mathcal{M}_1$
and $B \in \mathcal{M}_2$. We then let $\mathcal{A}$ denote the collection of all sets in $X$ that are
finite unions of disjoint measurable rectangles. It is easy to check that $\mathcal{A}$
is an algebra of subsets of $X$. (Indeed, the complement of a measurable
rectangle is the union of three disjoint such rectangles, while the union
of two measurable rectangles is the disjoint union of at most six such
rectangles.) From now on we abbreviate our terminology by referring to
measurable rectangles simply as “rectangles.”

On the rectangles we define the function $\mu_0$ by $\mu_0(A \times B) = \mu_1(A)\mu_2(B)$.

Now the fact that $\mu_0$ has a unique extension to the algebra $\mathcal{A}$ for which
$\mu_0$ becomes a premeasure is a consequence of the following fact: whenever
a rectangle $A \times B$ is the disjoint union of a countable collection of
rectangles $\{A_j \times B_j\}$, $A \times B = \bigcup_{j=1}^\infty A_j \times B_j$, then

(6) $\mu_0(A \times B) = \sum_{j=1}^\infty \mu_0(A_j \times B_j)$.
i=i 

To prove this, observe that if $x_1 \in A$, then for each $x_2 \in B$ the point
$(x_1, x_2)$ belongs to exactly one $A_j \times B_j$. Therefore we see that $B$ is the
disjoint union of the $B_j$ for which $x_1 \in A_j$. By the countable additivity
property of the measure $\mu_2$ this has as an immediate consequence the
fact that

$$\chi_A(x_1)\mu_2(B) = \sum_{j=1}^\infty \chi_{A_j}(x_1)\mu_2(B_j).$$

Hence integrating in $x_1$ and using the monotone convergence theorem we
get $\mu_1(A)\mu_2(B) = \sum_{j=1}^\infty \mu_1(A_j)\mu_2(B_j)$, which is (6).

Now that we know that $\mu_0$ is a premeasure on $\mathcal{A}$, we obtain from Theorem 1.5
a measure (which we denote by $\mu = \mu_1 \times \mu_2$) on the $\sigma$-algebra
$\mathcal{M}$ of sets generated by the algebra $\mathcal{A}$ of measurable rectangles. In this
way, we have defined the product measure space $(X_1 \times X_2, \mathcal{M}, \mu_1 \times \mu_2)$.

Given a set $E$ in $\mathcal{M}$ we shall now consider slices

$$E_{x_1} = \{x_2 \in X_2 : (x_1, x_2) \in E\} \quad \text{and} \quad E^{x_2} = \{x_1 \in X_1 : (x_1, x_2) \in E\}.$$

We recall the definitions according to which $\mathcal{A}_\sigma$ denotes the collection
of sets that are countable unions of elements of $\mathcal{A}$, and $\mathcal{A}_{\sigma\delta}$ the sets
that arise as countable intersections of sets from $\mathcal{A}_\sigma$. We then have the
following key fact.

Proposition 3.1 If $E$ belongs to $\mathcal{A}_{\sigma\delta}$, then $E^{x_2}$ is $\mu_1$-measurable for
every $x_2$; moreover, $\mu_1(E^{x_2})$ is a $\mu_2$-measurable function. In addition

(7) $\int_{X_2} \mu_1(E^{x_2})\,d\mu_2 = (\mu_1 \times \mu_2)(E)$.

Proof. One notes first that all the assertions hold immediately when
$E$ is a (measurable) rectangle. Next suppose $E$ is a set in $\mathcal{A}_\sigma$. Then we
can decompose it as a countable union of disjoint rectangles $E_j$. (If the
$E_j$ are not already disjoint we only need to replace the $E_j$ by $\bigcup_{k \le j} E_k - \bigcup_{k \le j-1} E_k$.)
Then for each $x_2$ we have $E^{x_2} = \bigcup_{j=1}^\infty E_j^{x_2}$, and we observe
that $\{E_j^{x_2}\}$ are disjoint sets. Thus by (7) applied to each rectangle $E_j$
and the monotone convergence theorem we get our conclusion for each
set $E \in \mathcal{A}_\sigma$.

Next assume $E \in \mathcal{A}_{\sigma\delta}$ and that $(\mu_1 \times \mu_2)(E) < \infty$. Then there is
a sequence $\{E_j\}$ of sets with $E_j \in \mathcal{A}_\sigma$, $E_{j+1} \subset E_j$, and $E = \bigcap_{j=1}^\infty E_j$.
We let $f_j(x_2) = \mu_1(E_j^{x_2})$ and $f(x_2) = \mu_1(E^{x_2})$. To see that $E^{x_2}$ is $\mu_1$-measurable
and $f(x_2)$ is well-defined, note that $E^{x_2}$ is the decreasing
limit of the sets $E_j^{x_2}$, which we have seen by the above are measurable.
Moreover, since $E_1 \in \mathcal{A}_\sigma$ and $(\mu_1 \times \mu_2)(E_1) < \infty$, we see that
$f_j(x_2) \to f(x_2)$, as $j \to \infty$ for each $x_2$. Thus $f(x_2)$ is measurable. However,
$\{f_j(x_2)\}$ is a decreasing sequence of non-negative functions, hence

$$\int_{X_2} f(x_2)\,d\mu_2(x_2) = \lim_{j \to \infty} \int_{X_2} f_j(x_2)\,d\mu_2(x_2),$$

and therefore (7) is proved in the case when $(\mu_1 \times \mu_2)(E) < \infty$. Now
since we assumed both $\mu_1$ and $\mu_2$ are $\sigma$-finite, we can find sequences
$F_1 \subset F_2 \subset \cdots \subset F_j \subset \cdots \subset X_1$ and $G_1 \subset G_2 \subset \cdots \subset G_j \subset \cdots \subset X_2$, with

$$\bigcup_{j=1}^\infty F_j = X_1, \quad \bigcup_{j=1}^\infty G_j = X_2, \quad \mu_1(F_j) < \infty, \ \text{and} \ \mu_2(G_j) < \infty \ \text{for all } j.$$

Then we merely need to replace $E$ by $E_j = E \cap (F_j \times G_j)$, and let $j \to \infty$
to obtain the general result.

We now extend the result in the above proposition to an arbitrary 
measurable set i? in Xi x X2, that is, E e M, the cr-algebra generated 
by the measurable rectangles. 

Proposition 3.2 If E is an arbitrary measurable set in X , then the 
conclusion of Proposition 3.1 are still valid except that we only assert that 
E^^ is pLi-measurable and p,i{E^^) is defined for almost every X2 G X2. 


Proof. Consider first the case when $E$ is a set of measure zero.
Then we know by Proposition 1.6 that there is a set
$F \in \mathcal{A}_{\sigma\delta}$ such that $E \subset F$ and
$(\mu_1 \times \mu_2)(F) = 0$. Since $E^{x_2} \subset F^{x_2}$ for every $x_2$,
and $F^{x_2}$ has $\mu_1$-measure zero for almost every $x_2$ by (7) applied
to $F$, the assumed completeness of the measure $\mu_1$ shows that $E^{x_2}$ is
measurable and has measure zero for those $x_2$. Thus the desired conclusion
holds when $E$ has measure zero.

If we drop this assumption on $E$, we can invoke Proposition 1.6 again
to find an $F \in \mathcal{A}_{\sigma\delta}$, $F \supset E$, such that
$F - E = Z$ has measure zero. Since $F^{x_2} - E^{x_2} = Z^{x_2}$ we can apply
the case we have just proved, and find that for almost all $x_2$ the set
$E^{x_2}$ is measurable and $\mu_1(E^{x_2}) = \mu_1(F^{x_2}) - \mu_1(Z^{x_2})$.
From this the proposition follows.

We now obtain the main result, generalizing Fubini's theorem in Chapter 2.


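Theorem 3.3 In the setting above, suppose $f(x_1, x_2)$ is an integrable
function on $(X_1 \times X_2, \mu_1 \times \mu_2)$.

(i) For almost every $x_2 \in X_2$, the slice $f^{x_2}(x_1) = f(x_1, x_2)$ is
integrable on $(X_1, \mu_1)$.

(ii) $\int_{X_1} f(x_1, x_2)\,d\mu_1$ is an integrable function on $X_2$.

(iii) $\int_{X_2} \left( \int_{X_1} f(x_1, x_2)\,d\mu_1 \right) d\mu_2 = \int_{X_1 \times X_2} f\,d(\mu_1 \times \mu_2)$.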

Proof. Note that if the desired conclusions hold for finitely many
functions, they also hold for their linear combinations. In particular it
suffices to assume that $f$ is non-negative. When $f = \chi_E$, where $E$ is a
set of finite measure, what we wish to prove is contained in Proposition 3.2.
Hence the desired result also holds for simple functions. Therefore by
the monotone convergence theorem it is established for all non-negative
functions, and the theorem is proved.

We remark that in general the product space $(X, \mathcal{M}, \mu)$ constructed
above is not complete. However, if we define the completed space
$(X, \overline{\mathcal{M}}, \overline{\mu})$ as in Exercise 2, the theorem
continues to hold in this completed space. The proof requires only a simple
modification of the argument in Proposition 3.2.

3.2 Integration formula for polar coordinates

The polar coordinates of a point $x \in \mathbb{R}^d - \{0\}$ are the pair
$(r, \gamma)$, where $0 < r < \infty$ and $\gamma$ belongs to the unit sphere
$S^{d-1} = \{x \in \mathbb{R}^d : |x| = 1\}$. These are determined by

(8) $\quad r = |x|, \quad \gamma = \dfrac{x}{|x|}, \quad$ and reciprocally by $x = r\gamma$.

Our intention here is to deal with the formula that, with appropriate
definitions and under suitable hypotheses, states:

(9) $\quad \displaystyle\int_{\mathbb{R}^d} f(x)\,dx = \int_{S^{d-1}} \left( \int_0^{\infty} f(r\gamma)\, r^{d-1}\,dr \right) d\sigma(\gamma).$

For this we consider the following pair of measure spaces. First,
$(X_1, \mathcal{M}_1, \mu_1)$, where $X_1 = (0, \infty)$, $\mathcal{M}_1$ is the
collection of Lebesgue measurable sets in $(0, \infty)$, and
$d\mu_1(r) = r^{d-1}\,dr$ in the sense that $\mu_1(E) = \int_E r^{d-1}\,dr$.
Next, $X_2$ is the unit sphere $S^{d-1}$, and the measure $\mu_2$ is
the one in effect determined by (9) with $\mu_2 = \sigma$. Indeed given any set
$E \subset S^{d-1}$ we let $\tilde{E} = \{x \in \mathbb{R}^d : x/|x| \in E,\ 0 < |x| < 1\}$
be the "sector" in the unit ball whose "end-points" are in $E$. We shall say
$E \in \mathcal{M}_2$ exactly when $\tilde{E}$ is a Lebesgue measurable subset
of $\mathbb{R}^d$, and define $\mu_2(E) = \sigma(E) = d \cdot m(\tilde{E})$,
where $m$ is Lebesgue measure in $\mathbb{R}^d$.

With this it is clear that both $(X_1, \mathcal{M}_1, \mu_1)$ and
$(X_2, \mathcal{M}_2, \mu_2)$ satisfy all the properties of complete and
$\sigma$-finite measure spaces. We note also that the sphere $S^{d-1}$ has a
metric on it given by $d(\gamma, \gamma') = |\gamma - \gamma'|$, for
$\gamma, \gamma' \in S^{d-1}$. If $E$ is an open set (with respect to this
metric) in $S^{d-1}$, then $\tilde{E}$ is open in $\mathbb{R}^d$, and hence
$E$ is a measurable set in $S^{d-1}$.

Theorem 3.4 Suppose $f$ is an integrable function on $\mathbb{R}^d$. Then for
almost every $\gamma \in S^{d-1}$ the slice $f^{\gamma}$ defined by
$f^{\gamma}(r) = f(r\gamma)$ is an integrable function with respect to the
measure $r^{d-1}\,dr$. Moreover, $\int_0^{\infty} f^{\gamma}(r)\, r^{d-1}\,dr$
is integrable on $S^{d-1}$ and the identity (9) holds.


There is a corresponding result with the order of integration of r and 
7 reversed. 

Proof. We consider the product measure $\mu = \mu_1 \times \mu_2$ on
$X_1 \times X_2$ given by Theorem 3.3. Since the space
$X_1 \times X_2 = \{(r, \gamma) : 0 < r < \infty \text{ and } \gamma \in S^{d-1}\}$
can be identified with $\mathbb{R}^d - \{0\}$, we can think of $\mu$
as a measure of the latter space, and our main task is to identify it with
the (restriction of) Lebesgue measure on that space. We claim first that

(10) $\quad m(E) = \mu(E)$

whenever $E$ is a measurable rectangle $E = E_1 \times E_2$, and in this case
$\mu(E) = \mu_1(E_1)\,\mu_2(E_2)$. In fact this holds for $E_2$ an arbitrary
measurable subset of $S^{d-1}$ and $E_1 = (0, 1)$, because then
$E = E_1 \times E_2$ is the sector $\tilde{E}_2$, while $\mu_1(E_1) = 1/d$.

Because of the relative dilation-invariance of Lebesgue measure, (10)
also holds when $E = (0, b) \times E_2$, $b > 0$. A simple limiting argument
then proves the result for sets $E_1 = (0, a]$, and by subtraction for all open
intervals $E_1 = (a, b)$, and thus for all open sets. Thus we have
$m(E_1 \times E_2) = \mu_1(E_1)\,\mu_2(E_2)$ for all open sets $E_1$, and hence
for all closed sets, and therefore for all Lebesgue measurable sets. (In fact,
we can find sets $F_1 \subset E_1 \subset O_1$ with $F_1$ closed and $O_1$ open,
such that $\mu_1(O_1) - \epsilon \le \mu_1(E_1) \le \mu_1(F_1) + \epsilon$, and
apply the above to $F_1 \times E_2$ and $O_1 \times E_2$.)
So we have established the identity (10) for all measurable rectangles
and as a result for all finite unions of measurable rectangles. This is
the algebra $\mathcal{A}$ that occurs in the proof of Theorem 3.3, and hence by
the uniqueness in Theorem 1.5, the identity extends to the $\sigma$-algebra
generated by $\mathcal{A}$, which is the $\sigma$-algebra $\mathcal{M}$ on which
the measure $\mu$ is defined. To summarize, whenever $E \in \mathcal{M}$, the
assertion (9) holds for $f = \chi_E$.

To go further we note that any open set in $\mathbb{R}^d - \{0\}$ can be written
as a countable union of rectangles $\bigcup_{j=1}^{\infty} A_j \times B_j$,
where $A_j$ and $B_j$ are open in $(0, \infty)$ and $S^{d-1}$, respectively.
(This small technical point is taken up in Exercise 12.) It follows that any
open set is in $\mathcal{M}$, and therefore so is any Borel set. Thus (9) is
valid for $\chi_E$ whenever $E$ is any Borel set in $\mathbb{R}^d - \{0\}$. The
result then goes over to any Lebesgue set $E' \subset \mathbb{R}^d - \{0\}$,
since such a set can be written as a disjoint union $E' = E \cup Z$, where $E$
is a Borel set and $Z \subset F$, with $F$ a Borel set of measure zero. To
finish the proof we follow the familiar steps of deducing (9) for simple
functions, and then by monotonic convergence
for non-negative integrable functions, and from that for the general case.
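
As an informal numerical sanity check of (9), one may take $d = 2$ and
$f(x) = e^{-|x|^2}$, for which both sides equal $\pi$. The sketch below is
only an illustration and plays no role in the proof: the function names, the
truncation radii, and the use of the midpoint rule are assumptions made here,
and it uses the value $\sigma(S^1) = 2\pi$.

```python
import math

def polar_side(n=20000, r_max=10.0):
    """Right-hand side of (9) for d = 2 and f(x) = exp(-|x|^2).

    f is radial, so the inner integral does not depend on gamma and the
    outer integral contributes sigma(S^1) = 2*pi. The radial integral of
    exp(-r^2) * r over (0, r_max) is approximated by the midpoint rule.
    """
    h = r_max / n
    radial = sum(math.exp(-((k + 0.5) * h) ** 2) * (k + 0.5) * h
                 for k in range(n)) * h
    return 2 * math.pi * radial

def cartesian_side(n=400, half_width=6.0):
    """Left-hand side of (9): integral of exp(-x1^2 - x2^2) over a large square.

    The integrand is a product, so the double integral is the square of a
    one-dimensional integral (midpoint rule on [-half_width, half_width]).
    """
    h = 2 * half_width / n
    one_dim = sum(math.exp(-(-half_width + (k + 0.5) * h) ** 2)
                  for k in range(n)) * h
    return one_dim ** 2

if __name__ == "__main__":
    print(polar_side(), cartesian_side(), math.pi)
```

Both functions truncate the domain, which is harmless here because the
integrand decays rapidly; for a slowly decaying $f$ the truncation radii would
have to be chosen with more care.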


3.3 Borel measures on R and the Lebesgue-Stieltjes integral 

The Stieltjes integral was introduced to provide a generalization of the
Riemann integral $\int_a^b f(x)\,dx$, where the increments $dx$ were replaced by
the increments $dF(x)$ for a given increasing function $F$ on $[a, b]$. We wish
to pursue this idea from the general point of view taken in this chapter.
The question that is then raised is that of characterizing the measures
on $\mathbb{R}$ that arise in this way, and in particular measures defined on
the Borel sets on the real line.

To have a unique correspondence between measures and increasing
functions as we shall have below, we need first to normalize these functions
appropriately. Recall that an increasing function $F$ can have at
most a countable number of discontinuities. If $x_0$ is such a discontinuity,
then

$$\lim_{\substack{x < x_0 \\ x \to x_0}} F(x) = F(x_0^-) \quad \text{and} \quad \lim_{\substack{x > x_0 \\ x \to x_0}} F(x) = F(x_0^+)$$

both exist, while $F(x_0^-) < F(x_0^+)$ and $F(x_0)$ is some value between
$F(x_0^-)$ and $F(x_0^+)$. We shall now modify $F$ at $x_0$, if necessary, by
setting $F(x_0) = F(x_0^+)$, and we do this for every point of discontinuity.
The function $F$ so obtained is now still increasing, yet right-continuous at
every point, and we say such functions are normalized. The main result
is then as follows.

Theorem 3.5 Let $F$ be an increasing function on $\mathbb{R}$ that is normalized.
Then there is a unique measure $\mu$ (also denoted by $dF$) on the Borel
sets $\mathcal{B}$ on $\mathbb{R}$ such that $\mu((a, b]) = F(b) - F(a)$ if
$a < b$. Conversely, if $\mu$ is a measure on $\mathcal{B}$ that is finite on
bounded intervals, then $F$ defined by $F(x) = \mu((0, x])$ for $x > 0$,
$F(0) = 0$, and $F(x) = -\mu((x, 0])$ for $x < 0$, is increasing and normalized.

Before we come to the proof, we remark that the condition that $\mu$ be
finite on bounded intervals is crucial. In fact, the Hausdorff measures
that will be considered in the next chapter provide examples of Borel
measures on $\mathbb{R}$ of a very different character from those treated in the
theorem.

Proof. We define a function $\mu_*$ on all subsets of $\mathbb{R}$ by

$$\mu_*(E) = \inf \sum_{j=1}^{\infty} \bigl(F(b_j) - F(a_j)\bigr),$$

where the infimum is taken over all coverings of $E$ of the form
$\bigcup_{j=1}^{\infty} (a_j, b_j]$.

It is easy to verify that $\mu_*$ is an exterior measure on $\mathbb{R}$. We
observe next that $\mu_*((a, b]) = F(b) - F(a)$ if $a < b$. Clearly
$\mu_*((a, b]) \le F(b) - F(a)$, since $(a, b]$ covers itself. Next, suppose
that $\bigcup_{j=1}^{\infty} (a_j, b_j]$ covers $(a, b]$; then it covers
$[a', b]$ for any $a < a' < b$. However, by the right-continuity of $F$, if
$\epsilon > 0$ is given, we can always choose $b_j' > b_j$ such that
$F(b_j') \le F(b_j) + \epsilon/2^j$. Now the union of open intervals
$\bigcup_{j=1}^{\infty} (a_j, b_j')$ covers $[a', b]$. By the compactness of
this interval, $\bigcup_{j=1}^{N} (a_j, b_j')$ covers $[a', b]$ for some $N$.
Thus since $F$ is increasing we have

$$F(b) - F(a') \le \sum_{j=1}^{N} F(b_j') - F(a_j) \le \sum_{j=1}^{N} \bigl(F(b_j) - F(a_j) + \epsilon/2^j\bigr) \le \mu_*((a, b]) + \epsilon.$$

Thus letting $a' \to a$, and using the right-continuity of $F$ again, we see
that $F(b) - F(a) \le \mu_*((a, b]) + \epsilon$. Since $\epsilon$ was arbitrary
this then proves $F(b) - F(a) = \mu_*((a, b])$.

Next we show that $\mu_*$ is a metric exterior measure (for the usual
metric $d(x, x') = |x - x'|$ on the real line). Since $\mu_*$ is an exterior
measure we have $\mu_*(E_1 \cup E_2) \le \mu_*(E_1) + \mu_*(E_2)$; thus it
suffices to see that the reverse inequality holds whenever
$d(E_1, E_2) \ge \delta$, for some $\delta > 0$.

Suppose that we are given a positive $\epsilon$, and that
$\bigcup_{j=1}^{\infty} (a_j, b_j]$ is a covering of $E_1 \cup E_2$ such that

$$\sum_{j=1}^{\infty} F(b_j) - F(a_j) \le \mu_*(E_1 \cup E_2) + \epsilon.$$

We may assume, after subdividing the intervals $(a_j, b_j]$ into smaller
half-open intervals, that each interval in the covering has length less than
$\delta$. When this is so each interval can intersect at most one of the two
sets $E_1$ or $E_2$. If we denote by $J_1$ and $J_2$ the sets of those indices
for which $(a_j, b_j]$ intersects $E_1$ and $E_2$, respectively, then
$J_1 \cap J_2$ is empty; moreover, we have
$E_1 \subset \bigcup_{j \in J_1} (a_j, b_j]$ as well as
$E_2 \subset \bigcup_{j \in J_2} (a_j, b_j]$. Therefore

$$\mu_*(E_1) + \mu_*(E_2) \le \sum_{j \in J_1} F(b_j) - F(a_j) + \sum_{j \in J_2} F(b_j) - F(a_j) \le \sum_{j=1}^{\infty} F(b_j) - F(a_j) \le \mu_*(E_1 \cup E_2) + \epsilon.$$

Since $\epsilon$ was arbitrary, we see that
$\mu_*(E_1) + \mu_*(E_2) \le \mu_*(E_1 \cup E_2)$, as we intended to show.

We can now invoke Theorem 1.5. This guarantees the existence of a
measure $\mu$ for which the Borel sets are measurable; moreover, we have
$\mu((a, b]) = F(b) - F(a)$, since clearly $(a, b]$ is a Borel set and we have
previously seen that $\mu_*((a, b]) = F(b) - F(a)$.

To prove that $\mu$ is the unique Borel measure on $\mathbb{R}$ for which
$\mu((a, b]) = F(b) - F(a)$, let us suppose that $\nu$ is another Borel measure
with this property. It now suffices to show that $\nu = \mu$ on all Borel sets.

We can write any open interval as a disjoint union
$(a, b) = \bigcup_{j=1}^{\infty} (a_j, b_j]$, by choosing $\{b_j\}_{j=1}^{\infty}$
to be a strictly increasing sequence with $a < b_j < b$, $b_j \to b$ as
$j \to \infty$, and taking $a_1 = a$, $a_{j+1} = b_j$. Since $\nu$ and $\mu$
agree on each $(a_j, b_j]$, it follows that $\nu$ and $\mu$ agree on $(a, b)$,
and hence on all open intervals, and therefore on all open sets. Moreover,
clearly $\nu$ and $\mu$ are finite on all bounded intervals; thus the regularity
in Proposition 1.3 allows one to conclude that $\nu = \mu$ on all Borel sets.

Conversely, if we start with a Borel measure $\mu$ on $\mathbb{R}$ that is
finite on bounded intervals, we can define the function $F$ as in the statement
of the theorem. Then clearly $F$ is increasing. To see that it is
right-continuous, note that if, for instance, $x_0 > 0$, the sets
$E_n = (0, x_0 + 1/n]$ decrease to $E = (0, x_0]$ as $n \to \infty$, hence
$\mu(E_n) \to \mu(E)$, since $\mu(E_1) < \infty$. This means that
$F(x_0 + 1/n) \to F(x_0)$. Since $F$ is increasing, this implies
that $F$ is right-continuous at $x_0$. The argument for any $x_0 \le 0$ is
similar, and thus the theorem is proved.
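
The converse direction of Theorem 3.5 can be tried out on a toy case. The
sketch below assumes a hypothetical purely atomic measure (the dictionary
`ATOMS` and both function names are invented for illustration) and reads the
case $x < 0$ of the theorem as $F(x) = -\mu((x, 0])$. It builds $F$ from
$\mu$ and lets one check $\mu((a, b]) = F(b) - F(a)$ and right-continuity at
an atom, using exact rational arithmetic.

```python
from fractions import Fraction

# Hypothetical purely atomic measure: mu({x}) = weight for each listed point.
ATOMS = {Fraction(-3, 2): Fraction(2), Fraction(0): Fraction(1),
         Fraction(1, 2): Fraction(3), Fraction(2): Fraction(1, 4)}

def mu_half_open(a, b):
    """mu((a, b]) for the atomic measure above (zero if a >= b)."""
    return sum((w for x, w in ATOMS.items() if a < x <= b), Fraction(0))

def F(x):
    """F(x) = mu((0, x]) for x > 0, F(0) = 0, F(x) = -mu((x, 0]) for x < 0."""
    if x > 0:
        return mu_half_open(Fraction(0), x)
    if x < 0:
        return -mu_half_open(x, Fraction(0))
    return Fraction(0)

if __name__ == "__main__":
    a, b = Fraction(-2), Fraction(1)
    print(mu_half_open(a, b), F(b) - F(a))   # the two values agree
```

The atom at $1/2$ shows the normalization at work: $F$ jumps there, and its
value at $1/2$ is the limit from the right, not from the left.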


Remarks. Several comments about the theorem are in order.

(i) Two increasing functions $F$ and $G$ give the same measure if $F - G$ is
constant. The converse is also true because $F(b) - F(a) = G(b) - G(a)$ for
all $a < b$ exactly when $F - G$ is constant.

(ii) The measure constructed in the proof of the theorem is defined
on a larger $\sigma$-algebra than the Borel sets, and is actually complete.
However, in applications, its restriction to the Borel sets often suffices.


(iii) If $F$ is an increasing normalized function given on a closed interval
$[a, b]$, we can extend it to $\mathbb{R}$ by setting $F(x) = F(a)$ for
$x < a$, and $F(x) = F(b)$ for $x > b$. For the resulting measure $\mu$, the
intervals $(-\infty, a]$ and $(b, \infty)$ have measure zero. One then often
writes

$$\int_{\mathbb{R}} f(x)\,d\mu(x) = \int_a^b f(x)\,dF(x),$$

for every $f$ that is integrable with respect to $\mu$. If $F$ arises from an
increasing function $F_0$ defined on $\mathbb{R}$, one may wish to account for
the possible jump of $F_0$ at $a$. In this case it is sometimes useful to
define

$$\int_{a^-}^{b} f(x)\,dF(x) \quad \text{as} \quad \int_{\mathbb{R}} \chi_{[a,b]}(x)\, f(x)\,d\mu_0(x),$$

where $\mu_0$ is the measure on $\mathbb{R}$ corresponding to $F_0$.

(iv) Note that the above definition of the Lebesgue-Stieltjes integral
extends to the case when $F$ is of bounded variation. Indeed suppose
$F$ is a complex-valued function on $[a, b]$ such that
$F = \sum_{j=1}^{4} \epsilon_j F_j$, where each $F_j$ is increasing and
normalized, and the $\epsilon_j$ are $\pm 1$ or $\pm i$. Then we can define
$\int_a^b f(x)\,dF(x)$ as $\sum_{j=1}^{4} \epsilon_j \int_a^b f(x)\,dF_j(x)$;
here we require that $f$ be integrable with respect to the Borel measure
$\mu = \sum_{j=1}^{4} \mu_j$, where $\mu_j$ is the measure corresponding to
$F_j$.

(v) The value of these integrals can be calculated more directly in the
following cases.

(a) If $F$ is an absolutely continuous function on $[a, b]$, then

$$\int_a^b f(x)\,dF(x) = \int_a^b f(x)\,F'(x)\,dx$$

for every Borel measurable function $f$ that is integrable with
respect to $\mu = dF$.

(b) Suppose $F$ is a pure jump function as in Section 3.3, Chapter 3, with
jumps $\{\alpha_n\}_{n=1}^{\infty}$ at the points $\{x_n\}_{n=1}^{\infty}$. Then
whenever $f$ is, say, continuous and vanishes outside some finite
interval we have

$$\int f(x)\,dF(x) = \sum_{n=1}^{\infty} f(x_n)\,\alpha_n.$$

In particular, for the measure $\mu$ we have $\mu(\{x_n\}) = \alpha_n$ and
$\mu(E) = 0$ for all sets $E$ that do not contain any of the $x_n$.

(c) A special instance arises when $F = H$, the Heaviside function
defined by $H(x) = 1$ for $x \ge 0$, and $H(x) = 0$ for $x < 0$. Then

$$\int_{-\infty}^{\infty} f(x)\,dH(x) = f(0),$$

which is another expression for the Dirac delta function arising
in Section 2 of Chapter 3.

Further details about (v) can be found in Exercise 11.
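
The three cases in (v) can be explored numerically. The following is a
minimal sketch, assuming that $f$ is continuous on $[a, b]$, so that
Riemann-Stieltjes sums with equally spaced nodes approximate the
Lebesgue-Stieltjes integral over $(a, b]$. The helper names
(`stieltjes_sum`, `heaviside`, `two_jumps`) and the sample functions are
illustrative choices, not notation from the text.

```python
import math

def stieltjes_sum(f, F, a, b, n=100_000):
    """Approximate the integral of f dF over (a, b] by the sum
    sum_k f(t_k) * (F(t_k) - F(t_{k-1})) with equally spaced nodes t_k,
    evaluating f at the right endpoint of each subinterval."""
    h = (b - a) / n
    total, prev = 0.0, F(a)
    for k in range(1, n + 1):
        t = a + k * h
        cur = F(t)
        total += f(t) * (cur - prev)
        prev = cur
    return total

def heaviside(x):
    """Normalized (right-continuous) Heaviside function, case (c)."""
    return 1.0 if x >= 0 else 0.0

def two_jumps(x):
    """Pure jump function, case (b): jump 2 at x = 0.25 and jump 5 at x = 0.5."""
    return (2.0 if x >= 0.25 else 0.0) + (5.0 if x >= 0.5 else 0.0)

if __name__ == "__main__":
    # (a) F(x) = x^2 is absolutely continuous: integral of x * 2x on [0, 1] is 2/3.
    print(stieltjes_sum(lambda x: x, lambda x: x * x, 0.0, 1.0))
    # (b) sum of f(x_n) * alpha_n = 2 * (1/16) + 5 * (1/4) = 1.375 for f(x) = x^2.
    print(stieltjes_sum(lambda x: x * x, two_jumps, 0.0, 1.0))
    # (c) integral of cos dH equals cos(0) = 1.
    print(stieltjes_sum(math.cos, heaviside, -1.0, 1.0))
```

The printed values are approximations only; the error depends on the step
size and on how close a node falls to each jump.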

4 Absolute continuity of measures 

The generalization of the notion of absolute continuity considered in 
Chapter 3 requires that we extend the ideas of a measure to encompass 
set functions that may be positive or negative. We describe this notion 
first. 

4.1 Signed measures 

Loosely speaking, a signed measure possesses all the properties of a measure,
except that it may take positive or negative values. More precisely,
a signed measure $\nu$ on a $\sigma$-algebra $\mathcal{M}$ is a mapping that
satisfies:

(i) The set function $\nu$ is extended-valued in the sense that
$-\infty < \nu(E) \le \infty$ for all $E \in \mathcal{M}$.

(ii) If $\{E_j\}_{j=1}^{\infty}$ are disjoint subsets of $\mathcal{M}$, then

$$\nu\Bigl(\bigcup_{j=1}^{\infty} E_j\Bigr) = \sum_{j=1}^{\infty} \nu(E_j).$$

Note that for this to hold the sum $\sum \nu(E_j)$ must be independent of
the rearrangements of terms, so that if $\nu(\bigcup_{j=1}^{\infty} E_j)$ is
finite, it implies that the sum converges absolutely.

Examples of signed measures arise naturally if we drop the assumption
that $f$ be non-negative in the expression

$$\nu(E) = \int_E f\,d\mu,$$

where $(X, \mathcal{M}, \mu)$ is a measure space and $f$ is $\mu$-measurable.
In fact, to ensure that $\nu$ satisfies (i) and (ii) the function $f$ is
required to be "integrable" with respect to $\mu$ in the extended sense that
$\int f^-\,d\mu$ must be finite, while $\int f^+\,d\mu$ may be infinite.

Given a signed measure $\nu$ on $(X, \mathcal{M})$ it is always possible to find
a (positive) measure $\mu$ that dominates $\nu$, in the sense that

$$\nu(E) \le \mu(E) \quad \text{for all } E,$$

and that in addition is the "smallest" $\mu$ that has this property.

The construction is in effect an abstract version of the decomposition
of a function of bounded variation as the difference of two increasing
functions, as carried out in Chapter 3. We proceed as follows. We define
a function $|\nu|$ on $\mathcal{M}$, called the total variation of $\nu$, by

$$|\nu|(E) = \sup \sum_{j=1}^{\infty} |\nu(E_j)|,$$

where the supremum is taken over all partitions of $E$, that is, over all
countable unions $E = \bigcup_{j=1}^{\infty} E_j$, where the sets $E_j$ are
disjoint and belong to $\mathcal{M}$.

The fact that $|\nu|$ is actually additive is not obvious, and is given in the
proof below.

Proposition 4.1 The total variation $|\nu|$ of a signed measure $\nu$ is itself
a (positive) measure that satisfies $\nu \le |\nu|$.

Proof. Suppose $\{E_j\}_{j=1}^{\infty}$ is a countable collection of disjoint
sets in $\mathcal{M}$, and let $E = \bigcup E_j$. It suffices to prove:

(11) $\quad \sum_j |\nu|(E_j) \le |\nu|(E) \quad \text{and} \quad |\nu|(E) \le \sum_j |\nu|(E_j).$

Let $\alpha_j$ be a real number that satisfies $\alpha_j < |\nu|(E_j)$. By
definition, each $E_j$ can be written as $E_j = \bigcup_i F_{i,j}$, where the
$F_{i,j}$ are disjoint, belong to $\mathcal{M}$, and

$$\alpha_j \le \sum_{i=1}^{\infty} |\nu(F_{i,j})|.$$

Since $E = \bigcup_{i,j} F_{i,j}$, we have

$$\sum_j \alpha_j \le \sum_{j,i} |\nu(F_{i,j})| \le |\nu|(E).$$

Consequently, taking the supremum over the numbers $\alpha_j$ gives the first
inequality in (11).

For the reverse inequality, let $\{F_k\}$ be any other partition of $E$. For a
fixed $k$, $\{F_k \cap E_j\}_j$ is a partition of $F_k$, so

$$\sum_k |\nu(F_k)| = \sum_k \Bigl| \sum_j \nu(F_k \cap E_j) \Bigr|,$$

since $\nu$ is a signed measure. An application of the triangle inequality and
the fact that $\{F_k \cap E_j\}_k$ is a partition of $E_j$ gives

$$\sum_k |\nu(F_k)| \le \sum_k \sum_j |\nu(F_k \cap E_j)| = \sum_j \sum_k |\nu(F_k \cap E_j)| \le \sum_j |\nu|(E_j).$$

Since $\{F_k\}$ was an arbitrary partition of $E$, we obtain the second
inequality in (11) and the proof is complete.

It is now possible to write $\nu$ as the difference of two (positive) measures.
To see this, we define the positive variation and negative variation
of $\nu$ by

$$\nu^+ = \tfrac{1}{2}(|\nu| + \nu) \quad \text{and} \quad \nu^- = \tfrac{1}{2}(|\nu| - \nu).$$

By the proposition we see that $\nu^+$ and $\nu^-$ are measures, and they
clearly satisfy

$$\nu = \nu^+ - \nu^- \quad \text{and} \quad |\nu| = \nu^+ + \nu^-.$$

In the above if $\nu(E) = \infty$ for a set $E$, then $|\nu|(E) = \infty$, and
$\nu^-(E)$ is defined to be zero.

We also make the following definition: we say that the signed measure
$\nu$ is $\sigma$-finite if the measure $|\nu|$ is $\sigma$-finite. Since
$\nu \le |\nu|$ and $|-\nu| = |\nu|$, we find that

$$-|\nu| \le \nu \le |\nu|.$$

As a result, if $\nu$ is $\sigma$-finite, then so are $\nu^+$ and $\nu^-$.
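
On a finite set these definitions can be checked by brute force. The sketch
below assumes a hypothetical signed measure on a four-point set given by
point weights (the dictionary `WEIGHTS` and all function names are invented
for illustration). It computes $|\nu|(E)$ literally as the supremum over all
partitions of $E$ and then forms $\nu^+$ and $\nu^-$ from the formulas above.

```python
def partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + part

# Hypothetical point weights nu({x}) on X = {a, b, c, d}.
WEIGHTS = {"a": 3, "b": -1, "c": -4, "d": 2}

def nu(E):
    """The signed measure nu(E) as the sum of the point weights in E."""
    return sum(WEIGHTS[x] for x in E)

def total_variation(E):
    """|nu|(E): supremum of sum |nu(E_j)| over all partitions {E_j} of E."""
    return max(sum(abs(nu(block)) for block in part)
               for part in partitions(sorted(E)))

def nu_plus(E):
    return (total_variation(E) + nu(E)) / 2

def nu_minus(E):
    return (total_variation(E) - nu(E)) / 2

if __name__ == "__main__":
    X = set(WEIGHTS)
    print(nu(X), total_variation(X), nu_plus(X), nu_minus(X))
```

In this finite setting the supremum is attained by the partition into single
points, so $|\nu|(E)$ is the sum of the absolute values of the weights in $E$;
the brute-force search is used only to mirror the definition.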

4.2 Absolute continuity 

Given two measures defined on a common $\sigma$-algebra we describe here the
relationships that can exist between them. More concretely, consider two
measures $\nu$ and $\mu$ defined on the $\sigma$-algebra $\mathcal{M}$; two
extreme scenarios are

(a) $\nu$ and $\mu$ are "supported" on separate parts of $\mathcal{M}$.

(b) The support of $\nu$ is an essential part of the support of $\mu$.

Here we adopt the terminology that the measure $\nu$ is supported on a
set $A$, if $\nu(E) = \nu(E \cap A)$ for all $E \in \mathcal{M}$.

The Lebesgue-Radon-Nikodym theorem below states that in a precise
sense the relationship between any two measures $\nu$ and $\mu$ is a
combination of the above two possibilities.

Mutually singular and absolutely continuous measures

Two signed measures $\nu$ and $\mu$ on $(X, \mathcal{M})$ are mutually singular
if there are disjoint subsets $A$ and $B$ in $\mathcal{M}$ so that

$$\nu(E) = \nu(A \cap E) \quad \text{and} \quad \mu(E) = \mu(B \cap E) \quad \text{for all } E \in \mathcal{M}.$$

Thus $\nu$ and $\mu$ are supported on disjoint subsets. We use the symbol
$\nu \perp \mu$ to denote the fact that the measures are mutually singular.

In contrast, if $\nu$ is a signed measure and $\mu$ a (positive) measure on
$\mathcal{M}$, we say that $\nu$ is absolutely continuous with respect to
$\mu$ if

(12) $\quad \nu(E) = 0$ whenever $E \in \mathcal{M}$ and $\mu(E) = 0$.

Thus if $\nu$ is supported in a set $A$, then $A$ must be an essential part of
the support of $\mu$ in the sense that $\mu(A) > 0$. We use the symbol
$\nu \ll \mu$ to indicate that $\nu$ is absolutely continuous with respect to
$\mu$. Note that if $\nu$ and $\mu$ are mutually singular, and $\nu$ is also
absolutely continuous with respect to $\mu$, then $\nu$ vanishes identically.

An important example is given by integration with respect to $\mu$. Indeed,
if $f \in L^1(X, \mu)$, or if $f$ is merely integrable in the extended sense
(where $\int f^- < \infty$, but possibly $\int f^+ = \infty$), then the signed
measure $\nu$ defined by

(13) $\quad \nu(E) = \displaystyle\int_E f\,d\mu$

is absolutely continuous with respect to $\mu$. We shall use the shorthand
$d\nu = f\,d\mu$ to indicate that $\nu$ is defined by (13).

This is a variant of the notion of absolute continuity that arose in
Chapter 3 in the special case of $\mathbb{R}$ (with $\mathcal{M}$ the Lebesgue
measurable sets and $d\mu = dx$ the Lebesgue measure). In fact, with $\nu$
defined by (13) and $f$ an integrable function, we saw that in place of (12)
we had the following stronger assertion:

(14) $\quad$ For each $\epsilon > 0$, there is a $\delta > 0$ such that $\mu(E) < \delta$ implies $|\nu(E)| < \epsilon$.

In the general situation the relation between the two conditions (12)
and (14) is clarified by the following observation.

Proposition 4.2 The assertion (14) implies (12). Conversely, if $|\nu|$ is
a finite measure, then (12) implies (14).

That (12) is a consequence of (14) is obvious because $\mu(E) = 0$ gives
$|\nu(E)| < \epsilon$ for every $\epsilon > 0$. To prove the converse, it
suffices to consider the case when $\nu$ is positive, upon replacing $\nu$ by
$|\nu|$. We then assume that (14) does not hold. This means that it fails for
some fixed $\epsilon > 0$. Hence for each $n$, there is a measurable set $E_n$
with $\mu(E_n) < 2^{-n}$ while $\nu(E_n) \ge \epsilon$. Now let
$E^* = \limsup_{n \to \infty} E_n = \bigcap_{n=1}^{\infty} E_n^*$, where
$E_n^* = \bigcup_{k \ge n} E_k$. Then since
$\mu(E_n^*) \le \sum_{k \ge n} 1/2^k = 1/2^{n-1}$, and the decreasing sets
$\{E_n^*\}$ are contained in a set of finite measure ($E_1^*$), we get
$\mu(E^*) = 0$. However $\nu(E_n^*) \ge \nu(E_n) \ge \epsilon$, and the $\nu$
measure is assumed finite. So
$\nu(E^*) = \lim_{n \to \infty} \nu(E_n^*) \ge \epsilon$, which gives a
contradiction.

After these preliminaries we can come to the main result. It guarantees
among other things a converse to the representation (13); it was proved
in the case of $\mathbb{R}$ by Lebesgue, and in the general case by Radon and
Nikodym.

Theorem 4.3 Suppose $\mu$ is a $\sigma$-finite positive measure on the measure
space $(X, \mathcal{M})$ and $\nu$ a $\sigma$-finite signed measure on
$\mathcal{M}$. Then there exist unique signed measures $\nu_a$ and $\nu_s$ on
$\mathcal{M}$ such that $\nu_a \ll \mu$, $\nu_s \perp \mu$ and
$\nu = \nu_a + \nu_s$. In addition, the measure $\nu_a$ takes the form
$d\nu_a = f\,d\mu$; that is,

$$\nu_a(E) = \int_E f(x)\,d\mu(x)$$

for some extended $\mu$-integrable function $f$.

Note the following consequence. If $\nu$ is absolutely continuous with respect
to $\mu$, then $d\nu = f\,d\mu$, and this assertion can be viewed as a
generalization of Theorem 3.11 in Chapter 3.

There are several known proofs of the above theorem. The argument
given below, due to von Neumann, has the virtue that it exploits elegantly
the application of a simple Hilbert space idea.

We start with the case when both $\nu$ and $\mu$ are positive and finite. Let
$\rho = \nu + \mu$, and consider the transformation on $L^2(X, \rho)$ defined by

$$\ell(\psi) = \int_X \psi(x)\,d\nu(x).$$

The mapping $\ell$ defines a bounded linear functional on $L^2(X, \rho)$ since

$$|\ell(\psi)| \le \int_X |\psi(x)|\,d\nu(x) \le \int_X |\psi(x)|\,d\rho(x) \le (\rho(X))^{1/2} \left( \int_X |\psi(x)|^2\,d\rho(x) \right)^{1/2},$$

where the last inequality follows by the Cauchy-Schwarz inequality. But
$L^2(X, \rho)$ is a Hilbert space, so the Riesz representation theorem (in
Chapter 4) guarantees the existence of $g \in L^2(X, \rho)$ such that

(15) $\quad \displaystyle\int_X \psi(x)\,d\nu(x) = \int_X \psi(x)\,g(x)\,d\rho(x) \quad \text{for all } \psi \in L^2(X, \rho).$

If $E \in \mathcal{M}$ with $\rho(E) > 0$, when we set $\psi = \chi_E$ in (15)
and recall that $\nu \le \rho$, we find

$$0 \le \frac{1}{\rho(E)} \int_E g(x)\,d\rho(x) \le 1,$$

from which we conclude that $0 \le g(x) \le 1$ for a.e. $x$ (with respect to the
measure $\rho$). In fact, $0 \le \int_E g(x)\,d\rho(x)$ for all sets
$E \in \mathcal{M}$ implies that $g(x) \ge 0$ almost everywhere. In the same
way, $0 \le \int_E (1 - g(x))\,d\rho(x)$ for all $E \in \mathcal{M}$ guarantees
that $g(x) \le 1$ almost everywhere. Therefore we may clearly assume
$0 \le g(x) \le 1$ for all $x$ without disturbing the identity (15), which we
rewrite as

(16) $\quad \displaystyle\int \psi\,(1 - g)\,d\nu = \int \psi\, g\,d\mu.$

Consider now the two sets

$$A = \{x \in X : 0 \le g(x) < 1\} \quad \text{and} \quad B = \{x \in X : g(x) = 1\},$$

and define two measures $\nu_a$ and $\nu_s$ on $\mathcal{M}$ by

$$\nu_a(E) = \nu(A \cap E) \quad \text{and} \quad \nu_s(E) = \nu(B \cap E).$$

To see why $\nu_s \perp \mu$, it suffices to note that setting $\psi = \chi_B$
in (16) gives

$$0 = \int \chi_B\,d\mu = \mu(B).$$

Finally, we set $\psi = \chi_E (1 + g + \cdots + g^n)$ in (16):

(17) $\quad \displaystyle\int_E (1 - g^{n+1})\,d\nu = \int_E g\,(1 + \cdots + g^n)\,d\mu.$

Since $(1 - g^{n+1})(x) = 0$ if $x \in B$, and $(1 - g^{n+1})(x) \to 1$ if
$x \in A$, the dominated convergence theorem implies that the left-hand side
of (17) converges to $\nu(A \cap E) = \nu_a(E)$. Also, $1 + g + \cdots + g^n$
converges to $\frac{1}{1 - g}$, so we find in the limit that

$$\nu_a(E) = \int_E f\,d\mu, \quad \text{where } f = \frac{g}{1 - g}.$$

Note that $f \in L^1(X, \mu)$, since $\nu_a(X) \le \nu(X) < \infty$. If $\mu$
and $\nu$ are $\sigma$-finite and positive we may clearly find sets
$E_j \in \mathcal{M}$ such that $X = \bigcup E_j$ and

$$\mu(E_j) < \infty, \quad \nu(E_j) < \infty \quad \text{for all } j.$$

We may define positive and finite measures on $\mathcal{M}$ by

$$\mu_j(E) = \mu(E \cap E_j) \quad \text{and} \quad \nu_j(E) = \nu(E \cap E_j),$$

and then we can write for each $j$, $\nu_j = \nu_{j,a} + \nu_{j,s}$, where
$\nu_{j,s} \perp \mu_j$ and $\nu_{j,a} = f_j\,d\mu_j$. Then it suffices to set

$$f = \sum f_j, \quad \nu_s = \sum \nu_{j,s}, \quad \text{and} \quad \nu_a = \sum \nu_{j,a}.$$

Finally, if $\nu$ is signed, then we apply the argument separately to the
positive and negative variations of $\nu$.

To prove the uniqueness of the decomposition, suppose we also have
$\nu = \nu_a' + \nu_s'$, where $\nu_a' \ll \mu$ and $\nu_s' \perp \mu$. Then

$$\nu_a - \nu_a' = \nu_s' - \nu_s.$$

The left-hand side is absolutely continuous with respect to $\mu$ and the
right-hand side is singular with respect to $\mu$. Thus both sides are zero
and the theorem is proved.
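
The steps of this argument can be followed on a finite set, where every
measure is a list of point masses and integrals are finite sums. The sketch
below assumes two hypothetical positive measures `MU` and `NU` on
$X = \{0, 1, 2, 3\}$ (all names are invented for illustration). It forms
$g = d\nu/d\rho$ with $\rho = \nu + \mu$, splits $X$ into
$A = \{g < 1\}$ and $B = \{g = 1\}$, and sets $f = g/(1 - g)$ on $A$.

```python
from fractions import Fraction

# Hypothetical finite positive measures on X = {0, 1, 2, 3}, as point masses.
MU = {0: Fraction(1), 1: Fraction(2), 2: Fraction(0), 3: Fraction(1, 2)}
NU = {0: Fraction(3), 1: Fraction(0), 2: Fraction(5), 3: Fraction(1)}

def decompose(mu, nu):
    """Return (f, nu_a, nu_s) as point-wise dictionaries, with rho = nu + mu."""
    f, nu_a, nu_s = {}, {}, {}
    for x in mu:
        rho = mu[x] + nu[x]
        # g = d(nu)/d(rho); at rho-null points g is arbitrary, and 0 is used.
        g = nu[x] / rho if rho != 0 else Fraction(0)
        if g < 1:
            # x lies in A = {0 <= g < 1}: nu_a carries the mass, f = g / (1 - g).
            f[x] = g / (1 - g)
            nu_a[x], nu_s[x] = nu[x], Fraction(0)
        else:
            # x lies in B = {g = 1}, which forces mu({x}) = 0; f is set to 0
            # there, a choice that is immaterial since mu(B) = 0.
            f[x] = Fraction(0)
            nu_a[x], nu_s[x] = Fraction(0), nu[x]
    return f, nu_a, nu_s

if __name__ == "__main__":
    f, nu_a, nu_s = decompose(MU, NU)
    print(f, nu_a, nu_s)
```

On each point of $A$ one has $\nu_a(\{x\}) = f(x)\,\mu(\{x\})$, while
$\nu_s$ lives on points of $\mu$-measure zero, which is the finite analogue of
$\nu_a \ll \mu$ and $\nu_s \perp \mu$.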

5* Ergodic theorems 

Ergodic theory had its beginnings in certain problems in statistical me- 
chanics studied in the late nineteenth century. Since then it has grown 
rapidly and has gained wide influence in a number of mathematical disci- 
plines, in particular those related to dynamical systems and probability 
theory. It is not our purpose to try to give an account of this broad 
and fascinating theory. Rather, we restrict our presentation to some of 
the basic limit theorems that lie at its foundation. These theorems are 
most naturally formulated in the general context of abstract measure 
spaces, and thus for us they serve as excellent illustrations of the general 
framework developed in this chapter. 

The setting for the theory is a $\sigma$-finite measure space
$(X, \mathcal{M}, \mu)$ endowed with a mapping $\tau : X \to X$ such that
whenever $E$ is a measurable subset of $X$, then so is $\tau^{-1}(E)$, and
$\mu(\tau^{-1}(E)) = \mu(E)$. Here $\tau^{-1}(E)$ is the pre-image of $E$ under
$\tau$; that is, $\tau^{-1}(E) = \{x \in X : \tau(x) \in E\}$. A
mapping $\tau$ with these properties is called a measure-preserving
transformation. If in addition for such a $\tau$ we have the feature that it
is a bijection and $\tau^{-1}$ is also a measure-preserving transformation,
then $\tau$ is referred to as a measure-preserving isomorphism.

Let us note that if $\tau$ is a measure-preserving transformation, then
$f(\tau(x))$ is measurable if $f(x)$ is measurable, and is integrable if $f$ is
integrable; moreover, then

(18) $\quad \displaystyle\int_X f(\tau(x))\,d\mu(x) = \int_X f(x)\,d\mu(x).$

Indeed, if $\chi_E$ is the characteristic function of the set $E$, we note that
$\chi_E(\tau(x)) = \chi_{\tau^{-1}(E)}(x)$, so the assertion holds for
characteristic functions of measurable sets and thus for simple functions, and
hence by the usual limiting arguments for all non-negative measurable
functions, and then integrable functions. For later purposes we record here an
equivalent statement: whenever $f$ is a real-valued measurable function and
$\alpha$ is any real number, then

$$\mu(\{x : f(x) > \alpha\}) = \mu(\{x : f(\tau(x)) > \alpha\}).$$

Before we proceed further, we describe several examples of measure- 
preserving transformations: 

(i) Here $X = \mathbb{Z}$, the integers, with $\mu$ its counting measure; that
is, $\mu(E) = \#(E)$ = the number of integers in $E$, for any
$E \subset \mathbb{Z}$. We define $\tau$ to be the unit translation,
$\tau : n \mapsto n + 1$. Note that $\tau$ gives a measure-preserving
isomorphism of $\mathbb{Z}$.

(ii) Another easy example is $X = \mathbb{R}^d$ with Lebesgue measure, and
$\tau$ a translation, $\tau : x \mapsto x + h$ for some fixed
$h \in \mathbb{R}^d$. This is of course a measure-preserving isomorphism. (See
the section on invariance properties of the Lebesgue measure in Chapter 1.)

(iii) Here $X$ is the unit circle, given as $\mathbb{R}/\mathbb{Z}$, with the
measure induced from Lebesgue measure on $\mathbb{R}$. That is, we may realize
$X$ as the unit interval $(0, 1]$, and take $\mu$ to be the Lebesgue measure
restricted to this interval. For any real number $\alpha$, the translation
$x \mapsto x + \alpha$, taken modulo $\mathbb{Z}$, is well defined on
$X = \mathbb{R}/\mathbb{Z}$, and is measure-preserving. (See the related
Exercise 3 in Chapter 2.) It can be interpreted as a rotation of the circle by
angle $2\pi\alpha$.

(iv) In this example $X$ is again $(0, 1]$ with Lebesgue measure but $\tau$
is the doubling map $\tau(x) = 2x \bmod 1$. It is easy to verify that
$\tau$ is a measure-preserving transformation. Indeed, any set
$E \subset (0, 1]$ has two pre-images $E_1$ and $E_2$, the first in
$(0, 1/2]$ and the second in $(1/2, 1]$, both of measure $\mu(E)/2$, if $E$ is
measurable. (See Figure 1.) However, $\tau$ is not an isomorphism, since
$\tau$ is not injective.

(v) A trickier example is given by the transformation that is key in
the theory of continued fractions. Here $X = [0, 1)$ and $\tau$ is defined
by $\tau(x) = \langle 1/x \rangle$, the fractional part of $1/x$; when $x = 0$
we set $\tau(0) = 0$. Gauss observed, in effect, that the measure
$d\mu = \frac{1}{1 + x}\,dx$ is preserved by the transformation $\tau$. Note
that each $x \in (0, 1)$ has infinitely many pre-images under $\tau$; that is,
the sequence $\{1/(x + k)\}_{k=1}^{\infty}$. More about this example can be
found in Problems 8 through 10 below.

Figure 1. Pre-images $E_1$ and $E_2$ under the doubling map

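Examples (iv) and (v) can be checked numerically on intervals. The sketch
below is only an illustration: the function names are invented here, the
Gauss measure is taken without normalization as $\int dx/(1 + x)$, and the
infinite union of pre-image branches for the continued fraction map is
truncated after finitely many terms, so the second comparison holds only up to
a small truncation error.

```python
import math

def doubling_preimage_length(a, b):
    """Lebesgue measure of the pre-image of (a, b] under x -> 2x mod 1 on (0, 1].

    The pre-image consists of (a/2, b/2] and ((a + 1)/2, (b + 1)/2]."""
    return (b / 2 - a / 2) + ((b + 1) / 2 - (a + 1) / 2)

def gauss_measure(c, d):
    """Integral of dx / (1 + x) over the interval from c to d."""
    return math.log((1 + d) / (1 + c))

def gauss_preimage_measure(a, b, terms=200_000):
    """Gauss measure of the pre-image of (a, b] under x -> fractional part of 1/x.

    The k-th branch of the pre-image is the interval from 1/(b + k) to
    1/(a + k); only the first `terms` branches are summed."""
    return sum(gauss_measure(1 / (b + k), 1 / (a + k))
               for k in range(1, terms + 1))

if __name__ == "__main__":
    a, b = 0.2, 0.7
    print(doubling_preimage_length(a, b), b - a)
    print(gauss_preimage_measure(a, b), gauss_measure(a, b))
```

The sum over branches telescopes, which is one way to see why the density
$1/(1 + x)$ is the right one; with Lebesgue measure in place of the Gauss
measure the two numbers in the second line would not agree.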

Having pointed out these examples, we can now return to the general
theory. The notions described above are of interest, in part, because they
abstract the idea of a dynamical system, one whose totality of states is
represented by the space $X$, with each point $x \in X$ giving a particular
state of the system. The mapping $\tau : X \to X$ then describes the
transformation of the system after a unit of time has elapsed. For such a
system there is often associated a notion of "volume" or "mass" that is
unchanged by the evolution, and this is the role of the invariant measure
$\mu$. The iterates, $\tau^n = \tau \circ \tau \circ \cdots \circ \tau$
($n$ times) describe the evolution of the system after $n$ units of time, and
a principal concern is the average behaviour, as $n \to \infty$, of various
quantities associated with the system. Thus one is led to study averages

(19) $\quad \displaystyle\frac{1}{n} \sum_{k=0}^{n-1} f(\tau^k(x))$

and their limits as $n \to \infty$. To this we now turn.

5.1 Mean ergodic theorem 

The first theorem dealing with the averages (19) that we consider is
purely Hilbert-space in character. Historically it preceded both
Theorems 5.3 and 5.4 which will be proved below.

For the specific application of the theorem below, one takes the Hilbert
space $\mathcal{H}$ to be $L^2(X, \mathcal{M}, \mu)$. Given the
measure-preserving transformation $\tau$ on $X$, we define the linear operator
$T$ on $\mathcal{H}$ by

(20) $\quad T(f)(x) = f(\tau(x)).$

Then $T$ is an isometry; that is,

(21) $\quad \|Tf\| = \|f\|,$

where $\|\cdot\|$ denotes the Hilbert space (that is, the $L^2$) norm. This is
clear from (18) with $f$ replaced by $|f|^2$. Observe that if $\tau$ were also
supposed to be a measure-preserving isomorphism, then $T$ would be invertible
and hence unitary; but we do not assume this.

Now with $T$ as above, consider the subspace $S$ of invariant vectors,
$S = \{f \in \mathcal{H} : T(f) = f\}$. Clearly, because of (21), the subspace
$S$ is closed. Let $P$ denote the orthogonal projection on this subspace.
The theorem that follows deals with the "mean" convergence, meaning
convergence in the norm.

Theorem 5.1 Suppose $T$ is an isometry of the Hilbert space $\mathcal{H}$, and
let $P$ be the orthogonal projection on the subspace of the invariant vectors
of $T$. Let $A_n = \frac{1}{n}(I + T + T^2 + \cdots + T^{n-1})$. Then for each
$f \in \mathcal{H}$, $A_n(f)$ converges to $P(f)$ in norm, as $n \to \infty$.

Together with the subspace $S$ defined above we consider the subspaces
$S_* = \{f \in \mathcal{H} : T^*(f) = f\}$ and
$S_1 = \{f \in \mathcal{H} : f = g - Tg,\ g \in \mathcal{H}\}$; here
$T^*$ denotes the adjoint of $T$. Then $S_*$, like $S$, is closed, but $S_1$ is
not necessarily closed. We denote its closure by $\overline{S_1}$. The proof of
the theorem is based on the following lemma.

Lemma 5.2 The following relations hold among the subspaces $S$, $S_*$,
and $\overline{S_1}$.

(i) $S = S_*$.

(ii) The orthogonal complement of $\overline{S_1}$ is $S$.

Proof. First, since $T$ is an isometry, we have that $(Tf, Tg) = (f, g)$
for all $f, g \in \mathcal{H}$, and thus $T^*T = I$. (See Exercise 22 in
Chapter 4.) So if $Tf = f$ then $T^*Tf = T^*f$, which means that $f = T^*f$.
To prove the converse inclusion, assume $T^*f = f$. As a consequence
$(f, T^*f - f) = 0$, and thus $(f, T^*f) - (f, f) = 0$; that is,
$(Tf, f) = \|f\|^2$. However, $\|Tf\| = \|f\|$, so we have in the above an
instance of equality for the Cauchy-Schwarz inequality. As a result of
Exercise 2 in Chapter 4 we get $Tf = cf$, which by the above gives $Tf = f$.
Thus part (i) is proved.

Next we observe that $f$ is in the orthogonal complement of $\overline{S_1}$
exactly when $(f, g - Tg) = 0$ for all $g \in \mathcal{H}$. However, this means
that $(f - T^*f, g) = 0$ for all $g$, and hence $f = T^*f$, which by part (i)
means $f \in S$.

Having established the lemma we can finish the proof of the theorem.
Given any $f \in \mathcal{H}$, we write $f = f_0 + f_1$, where $f_0 \in S$ and
$f_1 \in \overline{S_1}$ (since $S$ and $\overline{S_1}$ are orthogonal complements). We also
fix $\epsilon > 0$ and pick $f_1' \in S_1$ such that $\|f_1 - f_1'\| < \epsilon$. We then write

(22)    $A_n(f) = A_n(f_0) + A_n(f_1') + A_n(f_1 - f_1'),$

and consider each term separately.

For the first term, we recall that $P$ is the orthogonal projection on $S$,
so $P(f) = f_0$, and since $Tf_0 = f_0$ we deduce

$A_n(f_0) = \frac{1}{n} \sum_{k=0}^{n-1} T^k(f_0) = f_0 = P(f)$ for every $n \ge 1$.

For the second term, we recall the definition of $S_1$ and pick a $g \in \mathcal{H}$
with $f_1' = g - Tg$. Thus

$A_n(f_1') = \frac{1}{n} \sum_{k=0}^{n-1} T^k(1 - T)(g) = \frac{1}{n} \sum_{k=0}^{n-1} \left( T^k(g) - T^{k+1}(g) \right) = \frac{1}{n}\left( g - T^n(g) \right).$

Since $T$ is an isometry, the above identity shows that $A_n(f_1')$ converges
to 0 in the norm as $n \to \infty$.

For the last term, we use once again the fact that each $T^k$ is an isometry
to obtain

$\|A_n(f_1 - f_1')\| \le \frac{1}{n} \sum_{k=0}^{n-1} \|T^k(f_1 - f_1')\| \le \|f_1 - f_1'\| < \epsilon.$

Finally, from (22) and the above three observations, we deduce that
$\limsup_{n \to \infty} \|A_n(f) - P(f)\| \le \epsilon$, and this concludes the proof of the
theorem.
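
As a finite-dimensional illustration of Theorem 5.1, one may take the cyclic shift $\tau(i) = i + 1 \bmod N$ on $N$ points with counting measure. The Python sketch below is only an illustration under that assumption; the function names and the sample vector are ours and do not come from the text. Here the invariant vectors are the constants, so $P(f)$ is the mean of $f$, and the printed norms $\|A_n(f) - P(f)\|$ should shrink as $n$ grows.

```python
import math

def shift(f):
    """(Tf)[i] = f[tau(i)] with tau(i) = i + 1 mod N; T preserves the Euclidean norm."""
    n = len(f)
    return [f[(i + 1) % n] for i in range(n)]

def ergodic_average(f, n):
    """A_n(f) = (1/n) * (f + Tf + ... + T^(n-1) f)."""
    total = [0.0] * len(f)
    g = list(f)
    for _ in range(n):
        total = [t + x for t, x in zip(total, g)]
        g = shift(g)
    return [t / n for t in total]

def norm(f):
    return math.sqrt(sum(x * x for x in f))

f = [3.0, -1.0, 4.0, 1.0, -5.0]      # arbitrary sample vector (an assumption)
mean = sum(f) / len(f)
Pf = [mean] * len(f)                 # P(f): projection onto the constants

errors = []
for n in (1, 7, 101, 1003):
    An_f = ergodic_average(f, n)
    errors.append(norm([a - p for a, p in zip(An_f, Pf)]))
    print(n, errors[-1])
```

In this example the error is of size $O(1/n)$, in line with the identity $A_n(g - Tg) = \frac{1}{n}(g - T^n g)$ used in the proof.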

5.2 Maximal ergodic theorem 

We now turn to the question of almost everywhere convergence of the 
averages (19). As in the case of the averages that occur in the differ- 
entiation theorems of Chapter 3, the key to dealing with such pointwise 
limits lies in estimates for their corresponding maximal functions. In the 
present case this function is defined by 

(23)    $f^*(x) = \sup_{1 \le m < \infty} \frac{1}{m} \sum_{k=0}^{m-1} |f(\tau^k(x))|.$

Theorem 5.3 Whenever $f \in L^1(X, \mu)$, the maximal function $f^*(x)$ is
finite for almost every $x$. Moreover, there is a universal constant $A$ so
that

(24)    $\mu(\{x : f^*(x) > \alpha\}) \le \frac{A}{\alpha} \|f\|_{L^1(X,\mu)}$ for all $\alpha > 0$.

There are several proofs of this theorem. The one we choose emphasizes 
the close connection to the maximal function given in Section 1.1 of 
Chapter 3, and we shall in fact deduce the present theorem from the 
one-dimensional case of that chapter. This argument gives the value 
A = 6 for the constant in (24). By a different argument one can obtain 
A = 1, but this improvement is not relevant in what follows. 

Before beginning the proof, we make some preliminary remarks. Note 
that in the present case the function f* is automatically measurable, 
since it is the supremum of a countable number of measurable functions. 
Also, we may assume that our function $f$ is non-negative, since otherwise
we may replace it by $|f|$.

Step 1. The case when $X = \mathbb{Z}$ and $\tau : n \mapsto n + 1$.

For each function $f$ on $\mathbb{Z}$, we consider its extension $\tilde{f}$ to $\mathbb{R}$ defined by
$\tilde{f}(x) = f(n)$ for $n \le x < n + 1$, $n \in \mathbb{Z}$. (See Figure 2.)

Figure 2. Extension of $f$ to $\mathbb{R}$

Similarly, if $E \subset \mathbb{Z}$, denote by $\tilde{E}$ the set in $\mathbb{R}$ given by
$\tilde{E} = \bigcup_{n \in E} [n, n+1)$. Note that as a result of these definitions we have
$m(\tilde{E}) = \#(E)$ and $\int_{\mathbb{R}} \tilde{f}(x)\,dx = \sum_{n \in \mathbb{Z}} f(n)$, and thus
$\|\tilde{f}\|_{L^1(\mathbb{R})} = \|f\|_{L^1(\mathbb{Z})}$. Here $m$ is the Lebesgue measure on $\mathbb{R}$, and
$\#$ is the counting measure on $\mathbb{Z}$. Note also that

$\sum_{k=0}^{m-1} f(n + k) = \int_0^m \tilde{f}(n + t)\,dt.$

However, because $\int_0^m \tilde{f}(n + t)\,dt \le \int_{-1}^{m} \tilde{f}(x + t)\,dt$ whenever
$x \in [n, n+1)$, we see that

$\frac{1}{m} \sum_{k=0}^{m-1} f(n + k) \le \left( \frac{m+1}{m} \right) \frac{1}{m+1} \int_{-1}^{m} \tilde{f}(x + t)\,dt$ if $x \in [n, n+1)$.

Taking the supremum over all $m \ge 1$ in the above and noting that
$(m+1)/m \le 2$, we obtain

(25)    $f^*(n) \le 2(\tilde{f})^*(x)$ whenever $x \in [n, n+1)$.

To be clear about the notation here: $f^*(n)$ denotes the maximal function
of $f$ on $\mathbb{Z}$ defined by (23), with $f(\tau^k(n)) = f(n + k)$, while $(\tilde{f})^*$ is the
maximal function, as defined in Chapter 3, of the extended function $\tilde{f}$
on $\mathbb{R}$.

By (25)

$\#(\{n : f^*(n) > \alpha\}) \le m(\{x \in \mathbb{R} : (\tilde{f})^*(x) > \alpha/2\}),$

and thus the latter is majorized by $\frac{A'}{\alpha/2} \int \tilde{f}(x)\,dx = \frac{2A'}{\alpha} \|\tilde{f}\|_{L^1(\mathbb{R})}$,
according to the maximal theorem for $\mathbb{R}$. The constant $A'$ that occurs in
that theorem (there denoted by $A$) can be taken to be 3. Hence we have

(26)    $\#(\{n : f^*(n) > \alpha\}) \le \frac{6}{\alpha} \|f\|_{L^1(\mathbb{Z})},$

since $\|\tilde{f}\|_{L^1(\mathbb{R})} = \|f\|_{L^1(\mathbb{Z})}$. This disposes of the special case when $X = \mathbb{Z}$.
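
The estimate (26) can be checked by brute force on a small example. The following Python sketch assumes a non-negative $f$ supported on $\{0, \dots, L-1\}$; the helper names and the sample data are hypothetical. It uses two elementary observations: for such $f$ the supremum in (23) is attained for some $m \le L - n$, and for $n < 0$ one has $f^*(n) \le \|f\|_{L^1}/(1 - n)$, so only finitely many $n$ need to be examined.

```python
def maximal_function(f, n):
    """f*(n) = sup_{m >= 1} (1/m) * sum_{k<m} f(n + k), f >= 0 supported on 0..len(f)-1."""
    L = len(f)
    best, running = 0.0, 0.0
    # Beyond m = L - n the partial sum stops growing while m keeps increasing.
    for m in range(1, max(L - n, 1) + 1):
        j = n + m - 1
        if 0 <= j < L:
            running += f[j]
        best = max(best, running / m)
    return best

def level_set_size(f, alpha):
    """#{n in Z : f*(n) > alpha}, counted over the finite range where it can be non-empty."""
    l1 = sum(f)
    reach = int(l1 / alpha) + 1      # f*(n) > alpha forces n > 1 - l1/alpha
    return sum(1 for n in range(-reach, len(f)) if maximal_function(f, n) > alpha)

f = [0.0, 5.0, 1.0, 0.0, 0.0, 2.0, 7.0, 0.0, 3.0]   # sample data (an assumption)
l1 = sum(f)
for alpha in (0.5, 1.0, 2.0, 4.0):
    count = level_set_size(f, alpha)
    print(alpha, count, 6 * l1 / alpha)              # count and the bound in (26)
```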

Step 2. The general case.

By a sleight-of-hand we shall “transfer” the result for Z just proved to 
the general case. We proceed as follows. 

For every positive integer $N$, we consider the truncated maximal function
$f_N^*$ defined as

$f_N^*(x) = \sup_{1 \le m \le N} \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)).$

Since $\{f_N^*(x)\}$ forms an increasing sequence with $N$, and
$\lim_{N \to \infty} f_N^*(x) = f^*(x)$ for every $x$, it suffices to show that

(27)    $\mu\{x : f_N^*(x) > \alpha\} \le \frac{A}{\alpha} \|f\|_{L^1(X,\mu)},$

with constant $A$ independent of $N$. Letting $N \to \infty$ will then give the
desired result.

So in place of $f^*$ we estimate $f_N^*$, and to simplify our notation we write
the latter as $f^*$, dropping the $N$ subscript. Our argument will compare
the maximal function $f^*$ with the special case arising for $\mathbb{Z}$. To clarify
the formula below we temporarily adopt the expedient of denoting the
second maximal function by $\mathcal{M}(f)$. Thus for a positive function $f$ on $\mathbb{Z}$
we set

$\mathcal{M}(f)(n) = \sup_{1 \le m} \frac{1}{m} \sum_{k=0}^{m-1} f(n + k).$

Now starting with a function $f$ on $X$ that is integrable, we define the
function $F$ on $X \times \mathbb{Z}$ by

$F(x, n) = f(\tau^n(x))$ if $n \ge 0$, and $F(x, n) = 0$ if $n < 0$.

Then

$A_m(f)(x) = \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) = \frac{1}{m} \sum_{k=0}^{m-1} F(x, k).$

In the above we replace $x$ by $\tau^n(x)$; then since $\tau^k(\tau^n(x)) = \tau^{n+k}(x)$ we
have

$A_m(f)(\tau^n(x)) = \frac{1}{m} \sum_{k=0}^{m-1} F(x, n + k).$

Now we fix a large positive $a$ and set $b = a + N$. We also write $F_b$ for
the truncated function on $X \times \mathbb{Z}$ defined by $F_b(x, n) = F(x, n)$ if $n < b$,
$F_b(x, n) = 0$ otherwise. We then have

$A_m(f)(\tau^n(x)) = \frac{1}{m} \sum_{k=0}^{m-1} F_b(x, n + k)$ if $m \le N$ and $n < a$.

Thus

(28)    $f^*(\tau^n(x)) \le \mathcal{M}(F_b)(x, n)$ if $n < a$.
(Recall that $f^*$ is actually $f_N^*$!) This is the comparison of the two
maximal functions we wished to obtain. Now set $E_\alpha = \{x : f^*(x) > \alpha\}$. Then
by the measure-preserving character of $\tau$, $\mu(\{x : f^*(\tau^n(x)) > \alpha\}) = \mu(E_\alpha)$.
Hence on the product space $X \times \mathbb{Z}$ the product measure $\mu \times \#$
of the set $\{(x, n) \in X \times \mathbb{Z} : f^*(\tau^n(x)) > \alpha,\ 0 \le n < a\}$ equals $a\mu(E_\alpha)$.
However, because of (28) the $\mu \times \#$ measure of this set is no more than

$\int_X \#(\{n \in \mathbb{Z} : \mathcal{M}(F_b)(x, n) > \alpha\})\,d\mu.$

Because of the maximal estimate (26) for $\mathbb{Z}$, we see that the integrand
above is no more than

$\frac{A}{\alpha} \|F_b(x, n)\|_{L^1(\mathbb{Z})} = \frac{A}{\alpha} \sum_{n=0}^{b-1} f(\tau^n(x)),$

with of course $A = 6$.

Hence, integrating this over $X$ and recalling that $\int_X f(\tau^n(x))\,d\mu = \int_X f(x)\,d\mu$
gives us

$a\mu(E_\alpha) \le \frac{A}{\alpha}\, b\, \|f\|_{L^1(X)} = \frac{A}{\alpha} (a + N) \|f\|_{L^1(X)}.$

Thus $\mu(E_\alpha) \le \frac{A}{\alpha} \left( 1 + \frac{N}{a} \right) \|f\|_{L^1(X)}$, and letting $a \to \infty$ yields estimate (27).
As we have seen, a final limit as $N \to \infty$ then completes the proof.


5.3 Pointwise ergodic theorem 

The last of the series of limit theorems we will study is the pointwise 
(or individual) ergodic theorem, which combines ideas of the first two 
theorems. At this stage it will be convenient to assume that the measure
space $(X, \mu)$ is finite; we can then normalize the measure and suppose
$\mu(X) = 1$.

Theorem 5.4 Suppose $f$ is integrable over $X$. Then for almost every
$x \in X$ the averages $A_m(f) = \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x))$ converge to a limit as
$m \to \infty$.

Corollary 5.5 If we denote this limit by $P'(f)$, we have that

$\int_X |P'(f)(x)|\,d\mu(x) \le \int_X |f(x)|\,d\mu(x).$

Moreover $P'(f) = P(f)$ whenever $f \in L^2(X, \mu)$.
The idea of the proof is as follows. We first show that $A_m(f)$ converges
to a limit almost everywhere for a set of functions $f$ that is dense in
$L^1(X, \mu)$. We then use the maximal theorem to show that this implies
the conclusion for all integrable functions.

We remark to begin with that because the total measure of $X$ is 1, we
have $L^2(X, \mu) \subset L^1(X, \mu)$ and $\|f\|_{L^1} \le \|f\|_{L^2}$, and moreover $L^2(X, \mu)$ is
dense in $L^1(X, \mu)$. In fact, if $f$ belongs to $L^1$, consider the sequence
$\{f_n\}$ defined by $f_n(x) = f(x)$ if $|f(x)| \le n$, $f_n(x) = 0$ otherwise. Then
each $f_n$ is clearly in $L^2$, while by the dominated convergence theorem
$\|f - f_n\|_{L^1} \to 0$.

Now starting with an integrable $f$ and any $\epsilon > 0$ we shall see that we
can write $f = F + H$, where $\|H\|_{L^1} < \epsilon$, and $F = F_0 + (1 - T)G$, where
both $F_0$ and $G$ belong to $L^2$, and $T(F_0) = F_0$, with $T(F_0) = F_0(\tau(x))$. To
obtain this decomposition of $f$, we first write $f = f' + h'$, where $f' \in L^2$
and $\|h'\|_{L^1} < \epsilon/2$, which we can do in view of the density of $L^2$ in $L^1$
as seen above. Next, since the subspaces $S$ and $\overline{S_1}$ of Lemma 5.2 are
orthogonal complements in $L^2$, we can find $F_0 \in S$, $F_1 \in S_1$, such that
$f' = F_0 + F_1 + h$ with $\|h\|_{L^2} < \epsilon/2$. Because $F_1 \in S_1$ is automatically
of the form $F_1 = (1 - T)G$, we obtain $f = F + H$, with $F = F_0 + (1 - T)G$
and $H = h + h'$. Thus $\|H\|_{L^1} \le \|h\|_{L^1} + \|h'\|_{L^1}$, and since
$\|h\|_{L^1} \le \|h\|_{L^2} < \epsilon/2$ we have achieved our desired decomposition of $f$.

Now $A_m(F) = A_m(F_0) + A_m((1 - T)G) = F_0 + \frac{1}{m}(1 - T^m)(G)$, as we
have already seen in the proof of Theorem 5.1. Note that $\frac{1}{m} T^m(G) =
\frac{1}{m} G(\tau^m(x))$ converges to zero as $m \to \infty$ for almost every $x \in X$. Indeed,
the series $\sum_{m=1}^{\infty} \frac{1}{m^2} |T^m(G)(x)|^2$ converges almost everywhere by
the monotone convergence theorem, since its integral over $X$ is

$\sum_{m=1}^{\infty} \frac{1}{m^2} \|T^m G\|_{L^2}^2 = \|G\|_{L^2}^2 \sum_{m=1}^{\infty} \frac{1}{m^2},$

which is finite.

As a result, $A_m(F)(x)$ converges for almost every $x \in X$. Finally,
to prove the corresponding convergence for $A_m(f)(x)$, we argue as in
Theorem 1.3 in Chapter 3 and set

$E_\alpha = \{x : \lim_{N \to \infty} \sup_{n,m \ge N} |A_n(f)(x) - A_m(f)(x)| > \alpha\}.$

Then it suffices to see that $\mu(E_\alpha) = 0$ for all $\alpha > 0$. However, since
$A_n(f) - A_m(f) = A_n(F) - A_m(F) + A_n(H) - A_m(H)$, and $A_m(F)(x)$
converges almost everywhere as $m \to \infty$, it follows that almost every point
in the set $E_\alpha$ is contained in $E'_\alpha$, where

$E'_\alpha = \{x : \sup_{n,m \ge N} |A_n(H)(x) - A_m(H)(x)| > \alpha\},$

and thus $\mu(E_\alpha) \le \mu(E'_\alpha) \le \mu(\{x : 2 \sup_m |A_m(H)(x)| > \alpha\})$. The last
quantity is majorized by $\frac{A}{\alpha/2} \|H\|_{L^1} \le 2\epsilon A/\alpha$ by Theorem 5.3. Since
$\epsilon$ was arbitrary we see that $\mu(E_\alpha) = 0$, and hence $A_m(f)(x)$ is a Cauchy
sequence for almost every $x$, and the theorem is proved.

To establish the corollary, observe that if $f \in L^2(X)$, we know by
Theorem 5.1 that $A_m(f)$ converges to $P(f)$ in the $L^2$-norm, and hence
a subsequence converges almost everywhere to that limit, showing that
$P(f) = P'(f)$ in that case. Next, for any $f$ that is merely integrable, we
have

$\int_X |A_m(f)(x)|\,d\mu(x) \le \frac{1}{m} \sum_{k=0}^{m-1} \int_X |f(\tau^k(x))|\,d\mu(x) = \int_X |f(x)|\,d\mu(x),$

and thus since $A_m(f) \to P'(f)$ almost everywhere, we get by Fatou’s
lemma that $\int_X |P'(f)(x)|\,d\mu(x) \le \int_X |f(x)|\,d\mu(x)$. With this the
corollary is also proved.

It can be shown that the conclusions of the theorem and corollary are 
still valid if we drop the assumption that the space X has finite measure. 
The modifications of the argument needed to obtain this more general 
conclusion are outlined in Exercise 26. 

5.4 Ergodic measure-preserving transformations 

The adjective “ergodic” is commonly applied to the three limit theorems 
proved above. It also has a related but separate usage describing an 
important class of transformations of the space X. 

We say that a measure-preserving transformation $\tau$ of $X$ is ergodic
if whenever $E$ is a measurable set that is “invariant,” that is, $E$ and
$\tau^{-1}(E)$ differ by sets of measure zero, then either $E$ or $E^c$ has measure
zero.

There is a useful rephrasing of this condition of ergodicity. Expanding
the definition used in Section 5.1 we say that a measurable function $f$
is invariant if $f(x) = f(\tau(x))$ for a.e. $x \in X$. Then $\tau$ is ergodic exactly
when the only invariant functions are equivalent to constants. In fact,
let $\tau$ be an ergodic transformation, and assume that $f$ is a real-valued
invariant function. Then each of the sets $E_\alpha = \{x : f(x) > \alpha\}$ is invariant,
hence $\mu(E_\alpha) = 0$ or $\mu(E_\alpha^c) = 0$ for each $\alpha$. However, if $f$ is not equivalent
to a constant, then both $\mu(E_\alpha)$ and $\mu(E_\alpha^c)$ must be strictly positive
for some $\alpha$. In the converse direction we merely need to note
that if all characteristic functions of measurable sets that are invariant
must be constants, then $\tau$ is ergodic.

The following result subsumes the conclusion of Theorem 5.4 for ergodic
transformations. We keep to the assumption of that theorem that
the underlying space $X$ has measure equal to 1.

Corollary 5.6 Suppose $\tau$ is an ergodic measure-preserving transformation.
For any integrable function $f$ we have

$\frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x))$ converges to $\int_X f\,d\mu$ for a.e. $x \in X$ as $m \to \infty$.

The result has the interpretation that the “time average” of $f$ equals its
“space average.”

Proof. By Theorem 5.1 we know that the averages $A_m(f)$ converge
to $P(f)$, whenever $f \in L^2$, where $P$ is the orthogonal projection on the
subspace of invariant vectors. Since in this case the invariant vectors
form a one-dimensional space spanned by the constant functions, we
observe that $P(f) = 1(f, 1) = \int_X f\,d\mu$, where $1$ designates the function
identically equal to 1 on $X$. To verify this, note that $P$ is the identity on
constants and annihilates all functions orthogonal to constants. Next we
write any $f \in L^1$ as $g + h$, where $g \in L^2$ and $\|h\|_{L^1} < \epsilon$. Then $P'(f) =
P'(g) + P'(h)$. However, we also know that $P'(g) = P(g)$, and $\|P'(h)\|_{L^1} \le
\|h\|_{L^1} < \epsilon$ by the corollary to Theorem 5.4. Thus

$P'(f) - \int_X f\,d\mu = \int_X (g - f)\,d\mu + P'(h)$

yields that $\|P'(f) - \int_X f\,d\mu\|_{L^1} \le \|g - f\|_{L^1} + \epsilon < 2\epsilon$. This shows that
$P'(f)$ is the constant $\int_X f\,d\mu$, and the assertion is proved.

We shall now elaborate on the nature of ergodicity and illustrate its 
thrust in terms of several examples. 

a) Rotations of the circle 

Here we take up the example described in (iii) at the beginning of
Section 5*. On the unit circle $\mathbb{R}/\mathbb{Z}$ with the induced Lebesgue measure,
we consider the action $\tau$ given by $x \mapsto x + \alpha \bmod 1$. The result is

• The mapping $\tau$ is ergodic if and only if $\alpha$ is irrational.

To begin with, if $\alpha$ is irrational we know by the equidistribution theorem
that

(29)    $\frac{1}{n} \sum_{k=0}^{n-1} f(x + k\alpha) \to \int_0^1 f(x)\,dx$ as $n \to \infty$

for every $x$ if $f$ is continuous on $[0, 1]$ and periodic ($f(0) = f(1)$). The
argument used to prove this goes as follows.^ First we verify that (29)
holds whenever $f(x) = e^{2\pi i n x}$, $n \in \mathbb{Z}$, by considering the cases $n = 0$ and
$n \ne 0$ separately. It then follows that (29) is valid for any trigonometric
polynomial (a finite linear combination of these exponentials). Finally,
any continuous and periodic function can be uniformly approximated by
trigonometric polynomials, so (29) goes over to the general case.

Now if $P$ is the projection on invariant $L^2$-functions, then Theorem 5.1
and (29) show that $P$ projects onto the constants, when restricted to the
continuous periodic functions. Since this subspace is dense in $L^2$, we
see that $P$ still projects all of $L^2$ on constants; hence the invariant
$L^2$-functions are constants and thus $\tau$ is ergodic.

On the other hand, suppose $\alpha = p/q$. Choose any set $E_0 \subset (0, 1/q)$, so
that $0 < m(E_0) < 1/q$, and let $E$ denote the disjoint union
$\bigcup_{r=0}^{q-1} (E_0 + r/q)$. Then clearly $E$ is invariant under $\tau : x \mapsto x + p/q$, and
$0 < m(E) = q\,m(E_0) < 1$; thus $\tau$ is not ergodic.
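
The dichotomy between irrational and rational rotations can be observed numerically. The Python sketch below is an illustration only; the test function $f(x) = \cos^2(2\pi x)$, whose integral over $[0, 1]$ is $1/2$, and the choices $\alpha = \sqrt{2}$ and $\alpha = 1/2$ are our assumptions. For the irrational angle the time average approaches the space average for any starting point, while for $\alpha = 1/2$ it depends on the starting point.

```python
import math

def time_average(f, x0, alpha, n):
    """(1/n) * sum_{k<n} f(x0 + k*alpha mod 1)."""
    return sum(f((x0 + k * alpha) % 1.0) for k in range(n)) / n

def f(x):
    # Continuous and periodic on [0, 1]; its integral over [0, 1] equals 1/2.
    return math.cos(2 * math.pi * x) ** 2

n = 10000
irrational = time_average(f, 0.1, math.sqrt(2), n)   # close to the space average 1/2
rational_a = time_average(f, 0.0, 0.5, n)            # orbit {0, 1/2}: average 1
rational_b = time_average(f, 0.25, 0.5, n)           # orbit {1/4, 3/4}: average 0
print(irrational, rational_a, rational_b)
```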

The property (29) we used, which involves the existence of the limit
at all points, is actually stronger than ergodicity: it implies that the
measure $d\mu = dx$ is uniquely ergodic for this mapping $\tau$. That means
that if $\nu$ is any measure on the Borel sets of $X$ preserved by $\tau$ and
$\nu(X) = 1$, then $\nu$ must equal $\mu$.

To see that this is so in the present case, let $P_\nu$ be the orthogonal
projection guaranteed by Theorem 5.1, on the space $L^2(X, \nu)$. Then (29) shows
again that the range of $P_\nu$ on the continuous functions, and then on all
of $L^2(X, \nu)$, is the subspace of constants, and thus $P_\nu(f) = \int_0^1 f\,d\nu$.

This means also that $\int_0^1 f(x)\,dx = \int_0^1 f\,d\nu$ whenever $f$ is continuous
and periodic. By a simple limiting argument we then get that the measure
$dx = d\mu$ and $\nu$ agree on all open intervals, and thus on all open
sets. As we have seen, this then proves that the two measures are
identical.

In general, uniquely ergodic measure-preserving transformations are
ergodic, but the converse need not be true, as we shall see below.

b) The doubling mapping 


^See also Section 2, Chapter 4 in Book I. 
We now consider the mapping $\tau : x \mapsto 2x \bmod 1$ for $x \in (0, 1]$, with $\mu$
Lebesgue measure, that arose in example (iv) at the beginning of
Section 5*. We shall prove that $\tau$ is ergodic and in fact satisfies a different
and stronger property called mixing.^ It is defined as follows.

If $\tau$ is a measure-preserving transformation on the space $(X, \mu)$, it is
said to be mixing if whenever $E$ and $F$ are a pair of measurable subsets
then

(30)    $\mu(\tau^{-n}(E) \cap F) \to \mu(E)\mu(F)$ as $n \to \infty$.

The meaning of (30) can be understood as follows. In probability theory
one often encounters a “universe” of possible events to which probabilities
are assigned. These events are represented as measurable subsets $E$ of
some space $(X, \mu)$ with $\mu(X) = 1$. The probability of each event is then
$\mu(E)$. Two events $E$ and $F$ are “independent” if the probability that
they both occur is the product of their separate probabilities, that is,
$\mu(E \cap F) = \mu(E)\mu(F)$. The assertion (30) of mixing is then that in the
limit as time $n$ tends to infinity, the sets $\tau^{-n}(E)$ and $F$ are asymptotically
independent, whatever the choices of $E$ and $F$.

We shall next observe that the mixing condition is implied by the
seemingly stronger condition

(31)    $(T^n f, g) \to (f, 1)(1, g)$ as $n \to \infty$,

where $T^n(f)(x) = f(\tau^n(x))$ whenever $f$ and $g$ belong to $L^2(X, \mu)$. This
implication follows immediately upon taking $f = \chi_E$ and $g = \chi_F$. The
converse is also true, but we leave its proof as an exercise to the reader.

We now remark that the mixing condition implies the ergodicity of $\tau$.
Indeed, by (31)

$(A_n(f), g) = \frac{1}{n} \sum_{k=0}^{n-1} (T^k f, g)$ converges to $(f, 1)(1, g)$.

This means $(P(f), g) = (f, 1)(1, g)$, and hence $P(f)$ is orthogonal to all
$g$ that are orthogonal to constants. This of course means that $P$ is the
orthogonal projection on constants, and hence $\tau$ is ergodic.

^This property is often referred to as “strongly mixing” to distinguish it from still
another kind of ergodicity called “weakly mixing.”

We next observe that the doubling map is mixing. Indeed, if $f(x) =
e^{2\pi i m x}$, $g(x) = e^{2\pi i k x}$, then $(f, 1)(1, g) = 0$, unless both $m$ and $k$ are 0,
in which case this product equals 1. However, in this case $(T^n f, g) =
\int_0^1 e^{2\pi i m 2^n x} e^{-2\pi i k x}\,dx$, and this vanishes for sufficiently large $n$, unless
both $m$ and $k$ are 0, in which case the integral equals 1. Thus (31)
holds for all exponentials $f(x) = e^{2\pi i m x}$, $g(x) = e^{2\pi i k x}$, and therefore by
linearity for all trigonometric polynomials $f$ and $g$. It is from there an
easy step to use the completeness in Chapter 4 to pass to all $f$ and $g$ in
$L^2((0, 1])$ by approximating these functions in the $L^2$-norm by
trigonometric polynomials.
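
For intervals the mixing property (30) of the doubling map can be computed exactly. The Python sketch below is an illustration under our own assumptions: it takes $E = [0, 1/2)$ and $F = [0, 1/3)$, and uses the fact that $\tau^{-n}([0, c))$ is the union of the intervals $[j/2^n, (j + c)/2^n)$, $0 \le j < 2^n$. Endpoints have measure zero, so the choice between $(0, 1]$ and $[0, 1)$ does not affect the values. With exact rational arithmetic, $\mu(\tau^{-n}(E) \cap F)$ is seen to approach $\mu(E)\mu(F) = 1/6$.

```python
from fractions import Fraction

def preimage_overlap(n, e_len, f_len):
    """mu(tau^{-n}(E) & F) for E = [0, e_len), F = [0, f_len), tau(x) = 2x mod 1."""
    scale = 2 ** n
    total = Fraction(0)
    for j in range(scale):
        left = Fraction(j, scale)
        right = left + e_len / scale      # one component of tau^{-n}(E)
        overlap = min(right, f_len) - left
        if overlap <= 0:                  # all later components lie to the right of F
            break
        total += overlap
    return total

E_LEN, F_LEN = Fraction(1, 2), Fraction(1, 3)
values = [preimage_overlap(n, E_LEN, F_LEN) for n in range(11)]
for n, v in enumerate(values):
    print(n, v, float(abs(v - E_LEN * F_LEN)))
```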

Let us observe that the action of rotations $\tau : x \mapsto x + \alpha$ of the unit
circle for irrational $\alpha$, although ergodic, is not mixing. Indeed, if we take
$f(x) = g(x) = e^{2\pi i m x}$, $m \ne 0$, then $(T^n f, g) = e^{2\pi i n m \alpha}(f, g) = e^{2\pi i n m \alpha}$,
while $(f, 1) = (1, g) = 0$; thus $(T^n f, g)$ does not converge to $(f, 1)(1, g)$
as $n \to \infty$.

Finally, we note that the doubling map $\tau : x \mapsto 2x \bmod 1$ on $(0, 1]$
is not uniquely ergodic. Besides the Lebesgue measure, the measure $\nu$
with $\nu\{1\} = 1$ but $\nu(E) = 0$ if $1 \notin E$ is also preserved by $\tau$.

Further examples of ergodic transformations are given below. 

6* Appendix: the spectral theorem 

The purpose of this appendix is to present an outline of the proof of the spectral 
theorem for bounded symmetric operators on a Hilbert space. Details that are 
not central to the proof of the theorem will be left to the interested reader to fill 
in. The theorem provides an interesting application of the ideas related to the 
Lebesgue-Stieltjes integrals that are treated in this chapter. 

6.1 Statement of the theorem 

A basic notion is that of a spectral resolution (or spectral family) on a Hilbert
space $\mathcal{H}$. This is a function $\lambda \mapsto E(\lambda)$ from $\mathbb{R}$ to orthogonal projections on $\mathcal{H}$ that
satisfies the following:

(i) $E(\lambda)$ is increasing in the sense that $\|E(\lambda)f\|$ is an increasing function of $\lambda$
for every $f \in \mathcal{H}$.

(ii) There is an interval $[a, b]$ such that $E(\lambda) = 0$ if $\lambda < a$, and $E(\lambda) = I$ if $\lambda \ge b$.
Here $I$ denotes the identity operator on $\mathcal{H}$.

(iii) $E(\lambda)$ is right-continuous, that is, for every $\lambda$ one has

$\lim_{\mu \to \lambda,\ \mu > \lambda} E(\mu)f = E(\lambda)f$ for every $f \in \mathcal{H}$.

Observe that property (i) is equivalent with each of the following three assertions
(holding for all pairs $\lambda, \mu$ with $\mu > \lambda$): (a) the range of $E(\mu)$ contains the range of
$E(\lambda)$; (b) $E(\mu)E(\lambda) = E(\lambda)$; (c) $E(\mu) - E(\lambda)$ is an orthogonal projection.

Now given a spectral resolution $\{E(\lambda)\}$ and an element $f \in \mathcal{H}$, note that the
function $\lambda \mapsto (E(\lambda)f, f) = \|E(\lambda)f\|^2$ is also increasing. As a result, the
polarization identity (see Section 5 in Chapter 4) shows that for every pair $f, g \in \mathcal{H}$,
the function $F(\lambda) = (E(\lambda)f, g)$ is of bounded variation, and is moreover
right-continuous. With these two observations we can now state the main result.

Theorem 6.1 Suppose $T$ is a bounded symmetric operator on a Hilbert space $\mathcal{H}$.
Then there exists a spectral resolution $\{E(\lambda)\}$ such that

$T = \int_{a^-}^{b} \lambda\,dE(\lambda)$

in the sense that for every $f, g \in \mathcal{H}$

(32)    $(Tf, g) = \int_{a^-}^{b} \lambda\,d(E(\lambda)f, g) = \int_{a^-}^{b} \lambda\,dF(\lambda).$

The integral on the right-hand side is taken in the Lebesgue-Stieltjes sense, as
in (iii) and (iv) of Section 3.3.

The result encompasses the spectral theorem for compact symmetric operators $T$
in the following sense. Let $\{\varphi_k\}$ be an orthonormal basis of eigenvectors of $T$ with
corresponding eigenvalues $\lambda_k$, as guaranteed by Theorem 6.2 in Chapter 4. In this
case, we take the spectral resolution to be defined via this orthogonal expansion
by

$E(\lambda)f \sim \sum_{\lambda_k \le \lambda} (f, \varphi_k)\varphi_k,$

and one easily verifies that it satisfies conditions (i), (ii) and (iii) above. We also
note that $\|E(\lambda)f\|^2 = \sum_{\lambda_k \le \lambda} |(f, \varphi_k)|^2$, and thus $F(\lambda) = (E(\lambda)f, g)$ is a pure jump
function as in Section 3.3 in Chapter 3.
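
In finite dimensions the representation (32) reduces to a finite sum over the jumps of $F$. The Python sketch below illustrates this for a real symmetric $2 \times 2$ matrix; the matrix, its eigenvectors and all function names are our own choices made for illustration. It builds $F(\lambda) = (E(\lambda)f, g)$ from the eigenvector expansion and compares the resulting Stieltjes sum $\sum_k \lambda_k \left( F(\lambda_k) - F(\lambda_k^-) \right)$ with $(Tf, g)$.

```python
import math

# T = [[2, 1], [1, 2]] has eigenvalues 1 and 3 with the orthonormal eigenvectors below.
T = [[2.0, 1.0], [1.0, 2.0]]
eigenvalues = [1.0, 3.0]
eigenvectors = [(1 / math.sqrt(2), -1 / math.sqrt(2)),
                (1 / math.sqrt(2), 1 / math.sqrt(2))]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def apply(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def F(t, f, g):
    """F(t) = (E(t) f, g), where E(t) projects onto eigenvectors with eigenvalue <= t."""
    return sum(dot(f, p) * dot(p, g) for p, lam in zip(eigenvectors, eigenvalues) if lam <= t)

def stieltjes_integral(f, g):
    """Integral of t dF(t) for the pure jump function F: eigenvalue times jump, summed."""
    total, previous = 0.0, 0.0
    for lam in sorted(eigenvalues):
        current = F(lam, f, g)
        total += lam * (current - previous)
        previous = current
    return total

f, g = (1.0, 2.0), (3.0, -1.0)
print(dot(apply(T, f), g), stieltjes_integral(f, g))
```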

6.2 Positive operators 

The proof of the theorem depends on the concept of positivity of operators. We
say that $T$ is positive, written as $T \ge 0$, if $T$ is symmetric and $(Tf, f) \ge 0$ for
all $f \in \mathcal{H}$. (Note that $(Tf, f)$ is automatically real if $T$ is symmetric.) One then
writes $T_1 \ge T_2$ to mean that $T_1 - T_2 \ge 0$. Note that for two orthogonal projections
we have $E_2 \ge E_1$ if and only if $\|E_2 f\| \ge \|E_1 f\|$ for all $f \in \mathcal{H}$, and that is then
equivalent with the corresponding properties (a)-(c) described above. Notice also
that if $S$ is symmetric, then $S^2 = T$ is positive. Now for $T$ symmetric, let us write

(33)    $a = \min (Tf, f)$ and $b = \max (Tf, f)$ for $\|f\| \le 1$.

Proposition 6.2 Suppose $T$ is symmetric. Then $\|T\| \le M$ if and only if $-MI \le
T \le MI$. As a result, $\|T\| = \max(|a|, |b|)$.

This is a consequence of (7) in Chapter 4.

Proposition 6.3 Suppose $T$ is positive. Then there exists a symmetric operator
$S$ (which can be written as $T^{1/2}$) such that $S^2 = T$ and $S$ commutes with every
operator that commutes with $T$.
The last assertion means that if for some operator $A$ we have $AT = TA$, then
$AS = SA$.

The existence of $S$ is seen as follows. After multiplying by a suitable positive
scalar, we may assume that $\|T\| \le 1$. Consider the binomial expansion of
$(1 - t)^{1/2}$, given by $(1 - t)^{1/2} = \sum_{k=0}^{\infty} b_k t^k$, for $|t| < 1$. The relevant fact that
is needed here is that the $b_k$ are real and $\sum_{k=0}^{\infty} |b_k| < \infty$. Indeed, by direct
calculation of the power series expansion of $(1 - t)^{1/2}$ we find that $b_0 = 1$,
$b_1 = -1/2$, $b_2 = -1/8$, and more generally, $b_k = -1/2 \cdot 1/2 \cdots (k - 3/2)/k!$, if
$k \ge 2$, from which it follows that $b_k = O(k^{-3/2})$. Or more simply, since $b_k < 0$
when $k \ge 1$, if we let $t \to 1$ in the definition, we see that $-\sum_{k=1}^{\infty} b_k = 1$ and so
$\sum_{k=0}^{\infty} |b_k| = 2$.

Now let $s_n(t)$ denote the polynomial $\sum_{k=0}^{n} b_k t^k$. Then the polynomial

(34)    $s_n^2(t) - (1 - t) = \sum_{k=0}^{2n} c_k^n t^k$

has the property that $\sum_{k=0}^{2n} |c_k^n| \to 0$ as $n \to \infty$. In fact, $s_n(t) = (1 - t)^{1/2} - r_n(t)$,
with $r_n(t) = \sum_{k=n+1}^{\infty} b_k t^k$, so $s_n^2(t) - (1 - t) = -r_n^2(t) - 2 s_n(t) r_n(t)$. Now the
left-hand side is clearly a polynomial of degree $\le 2n$, and so comparing coefficients with
those on the right-hand side shows that the $c_k^n$ are majorized by $3 \sum_{j > n} |b_j|\,|b_{k-j}|$.
From this it is immediate that $\sum_k |c_k^n| = O\left( \sum_{j > n} |b_j| \right) \to 0$ as $n \to \infty$, as asserted.

To apply this, set $T_1 = I - T$; then $0 \le T_1 \le I$, and thus $\|T_1\| \le 1$, by
Proposition 6.2. Let $S_n = s_n(T_1) = \sum_{k=0}^{n} b_k T_1^k$, with $T_1^0 = I$. Then in terms of operator
norms, $\|S_n - S_m\| \le \sum_{k \ge \min(n,m)} |b_k| \to 0$ as $n, m \to \infty$, because
$\|T_1^k\| \le \|T_1\|^k \le 1$. Hence $S_n$ converges to some operator $S$. Clearly $S_n$ is symmetric for each $n$,
and thus $S$ is also symmetric. Moreover, by (34), $S_n^2 - T = \sum_{k=0}^{2n} c_k^n T_1^k$, therefore
$\|S_n^2 - T\| \le \sum_k |c_k^n| \to 0$ as $n \to \infty$, which implies that $S^2 = T$. Finally, if $A$
commutes with $T$ it clearly commutes with every polynomial in $T$, hence with $S_n$, and
thus with $S$. The proof of the proposition is therefore complete.
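
The construction of $S$ can be tried out on a small matrix. The Python sketch below is an illustration under our own assumptions: a positive $2 \times 2$ matrix $T$ with eigenvalues $0.3$ and $0.7$, and the truncation level $n = 60$. It generates the coefficients through the recursion $b_k = b_{k-1}(k - 3/2)/k$, which is equivalent to the product formula above, forms $S_n = \sum_{k \le n} b_k (I - T)^k$, and measures how far $S_n^2$ is from $T$.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_add(A, B, scale=1.0):
    """Entrywise A + scale * B for 2x2 matrices."""
    return [[A[i][j] + scale * B[i][j] for j in range(2)] for i in range(2)]

def binomial_coefficients(n):
    """b_0, ..., b_n in (1 - t)^(1/2) = sum b_k t^k, via b_k = b_(k-1) * (k - 3/2) / k."""
    b = [1.0]
    for k in range(1, n + 1):
        b.append(b[-1] * (k - 1.5) / k)
    return b

def sqrt_partial_sum(T, n):
    """S_n = sum_{k <= n} b_k * T1^k with T1 = I - T."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    T1 = mat_add(identity, T, scale=-1.0)
    b = binomial_coefficients(n)
    S = [[0.0, 0.0], [0.0, 0.0]]
    power = identity
    for k in range(n + 1):
        S = mat_add(S, power, scale=b[k])
        power = mat_mul(power, T1)
    return S

T = [[0.5, 0.2], [0.2, 0.5]]      # positive, with norm 0.7 <= 1 (sample choice)
S = sqrt_partial_sum(T, 60)
S2 = mat_mul(S, S)
residual = max(abs(S2[i][j] - T[i][j]) for i in range(2) for j in range(2))
print(S, residual)
```

Convergence is geometric here because $\|I - T\| = 0.7 < 1$; when $T$ has spectrum near 0 the series converges only at the slower rate dictated by $\sum |b_k| < \infty$.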

Proposition 6.4 If $T_1$ and $T_2$ are positive operators that commute, then $T_1 T_2$ is
also positive.

Indeed, if $S$ is a square root of $T_1$ given in the previous proposition, then $T_1 T_2 =
SST_2 = ST_2S$, and hence $(T_1 T_2 f, f) = (ST_2Sf, f) = (T_2Sf, Sf)$, since $S$ is
symmetric, and thus the last term is positive.

Proposition 6.5 Suppose $T$ is symmetric and $a$ and $b$ are given by (33). If $p(t) =
\sum_{k=0}^{n} c_k t^k$ is a real polynomial which is positive for $t \in [a, b]$, then the operator
$p(T) = \sum_{k=0}^{n} c_k T^k$ is positive.

To see this, write $p(t) = c \prod_j (t - \rho_j) \prod_k (\rho'_k - t) \prod_\ell ((t - \mu_\ell)^2 + \nu_\ell)$, where $c$ is
positive and the third factor corresponds to the non-real roots of $p(t)$ (arising in
conjugate pairs), and the real roots of $p(t)$ lying in $(a, b)$ which are necessarily of
even order. The first factor contains the real roots $\rho_j$ with $\rho_j \le a$, and the second
factor the real roots $\rho'_k$ with $\rho'_k \ge b$. Since each of the factors $T - \rho_j I$, $\rho'_k I - T$
and $(T - \mu_\ell I)^2 + \nu_\ell I$ is positive and these commute, the desired conclusion follows
from the previous proposition.

Corollary 6.6 If $p(t)$ is a real polynomial, then

$\|p(T)\| \le \sup_{t \in [a,b]} |p(t)|.$

This is an immediate consequence using Proposition 6.2, since $-M \le p(t) \le M$,
where $M = \sup_{t \in [a,b]} |p(t)|$, and thus $-MI \le p(T) \le MI$.

Proposition 6.7 Suppose $\{T_n\}$ is a sequence of positive operators that satisfy
$T_n \ge T_{n+1}$ for all $n$. Then there is a positive operator $T$, such that $T_n f \to Tf$ as
$n \to \infty$ for every $f \in \mathcal{H}$.

Proof. We note that for each fixed $f \in \mathcal{H}$ the sequence of positive numbers
$(T_n f, f)$ is decreasing and hence convergent. Now observe that for any positive
operator $S$ with $\|S\| \le M$ we have

(35)    $\|S(f)\|^2 \le (Sf, f)^{1/2} M^{3/2} \|f\|.$

In fact, the quadratic function $(S(tI + S)f, (tI + S)f) = t^2(Sf, f) + 2t(Sf, Sf) +
(S^2 f, Sf)$ is positive for all real $t$. Hence its discriminant is non-positive, that is,
$\|S(f)\|^4 \le (Sf, f)(S^2 f, Sf)$, and (35) follows. We apply this to $S = T_n - T_m$ with
$n < m$; then $\|T_n - T_m\| \le \|T_n\| \le \|T_1\| = M$, and since $((T_n - T_m)f, f) \to 0$ as
$n, m \to \infty$ we see that $\|T_n f - T_m f\| \to 0$ as $n, m \to \infty$. Thus $\lim_{n \to \infty} T_n(f) =
T(f)$ exists, and $T$ is also clearly positive.


6.3 Proof of the theorem 

Starting with a given symmetric operator $T$, and with $a$, $b$ given by (33), we shall
now exploit further the idea of associating to each suitable function $\Phi$ on $[a, b]$ a
symmetric operator $\Phi(T)$. We do this in increasing order of generality. First, if
$\Phi$ is a real polynomial $\sum_{k=0}^{n} c_k t^k$, then, as before, $\Phi(T)$ is defined as $\sum_{k=0}^{n} c_k T^k$.

Notice that this association is a homomorphism: if $\Phi = \Phi_1 + \Phi_2$, then $\Phi(T) =
\Phi_1(T) + \Phi_2(T)$; also if $\Phi = \Phi_1 \cdot \Phi_2$, then $\Phi(T) = \Phi_1(T) \cdot \Phi_2(T)$. Moreover, since
$\Phi$ is real (and the $c_k$ are real), $\Phi(T)$ is symmetric.

Next, because every real-valued continuous function $\Phi$ on $[a, b]$ can be
approximated uniformly by polynomials $p_n$ (see, for instance, Section 1.8, Chapter 5 of
Book I), we see by Corollary 6.6 that the sequence $p_n(T)$ converges, in the norm of
operators, to a limit which we call $\Phi(T)$, and moreover this limit does not depend
on the sequence of polynomials approximating $\Phi$. Also, $\Phi(T)$ is automatically a
symmetric operator. If $\Phi(t) \ge 0$ on $[a, b]$ we can always take the approximating
sequence to be positive on $[a, b]$, and as a result $\Phi(T) \ge 0$.

Finally, we define $\Phi(T)$ whenever $\Phi$ arises as a limit, $\Phi(t) = \lim_{n \to \infty} \Phi_n(t)$,
where $\{\Phi_n(t)\}$ is a decreasing sequence of positive continuous functions on $[a, b]$. In
fact, by Proposition 6.7 the limit $\lim_{n \to \infty} \Phi_n(T)$ exists by what we have established
above for $\Phi_n$. To show that this limit is independent of the sequence $\{\Phi_n\}$ and
thus that $\Phi(T)$ is well-defined as the limit above, let $\{\Psi_n\}$ be another sequence of
decreasing continuous functions converging to $\Phi$. Then whenever $\epsilon > 0$ is given and
$k$ is fixed, $\Psi_n(t) \le \Phi_k(t) + \epsilon$ for all $n$ sufficiently large. Thus $\Psi_n(T) \le \Phi_k(T) + \epsilon I$
for these $n$, and passing to the limit first in $n$, then in $k$, and then with $\epsilon \to 0$, we get
$\lim_{n \to \infty} \Psi_n(T) \le \lim_{k \to \infty} \Phi_k(T)$. By symmetry, the reverse inequality holds, and
the two limits are the same. Note also that for a pair of these limiting functions,
if $\Phi_1(t) \le \Phi_2(t)$ for $t \in [a, b]$, then $\Phi_1(T) \le \Phi_2(T)$.

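The basic functions $\Phi$, $\varphi^\lambda$, that give us the spectral resolution are defined
for each real $\lambda$ by

$\varphi^\lambda(t) = 1$ if $t \le \lambda$ and $\varphi^\lambda(t) = 0$ if $\lambda < t$.

We note that $\varphi^\lambda(t) = \lim \varphi_n^\lambda(t)$, where $\varphi_n^\lambda(t) = 1$ if $t \le \lambda$, $\varphi_n^\lambda(t) = 0$ if $t \ge \lambda + 1/n$,
and $\varphi_n^\lambda(t)$ is linear for $t \in [\lambda, \lambda + 1/n]$. Thus each $\varphi^\lambda$ is a limit of a decreasing
sequence of continuous functions. In accordance with the above we set

$E(\lambda) = \varphi^\lambda(T).$

Since $\lim_{n \to \infty} \varphi_n^{\lambda_1}(t) \varphi_n^{\lambda_2}(t) = \varphi^{\lambda_1}(t)$ whenever $\lambda_1 \le \lambda_2$, we see that
$E(\lambda_1)E(\lambda_2) = E(\lambda_1)$. Thus $E(\lambda)^2 = E(\lambda)$ for every $\lambda$, and because $E(\lambda)$ is symmetric it is
therefore an orthogonal projection. Moreover, for every $f \in \mathcal{H}$,

$\|E(\lambda_1)f\| = \|E(\lambda_1)E(\lambda_2)f\| \le \|E(\lambda_2)f\|,$

thus $E(\lambda)$ is increasing. Clearly $E(\lambda) = 0$ if $\lambda < a$, since for those $\lambda$, $\varphi^\lambda(t) = 0$ on
$[a, b]$. Similarly, $E(\lambda) = I$ for $\lambda \ge b$.

Next we note that $E(\lambda)$ is right-continuous. In fact, fix $f \in \mathcal{H}$ and $\epsilon > 0$. Then
for some $n$, which we now keep fixed, $\|E(\lambda)f - \varphi_n^\lambda(T)f\| < \epsilon$. However, $\varphi_n^\mu(t)$
converges to $\varphi_n^\lambda(t)$ uniformly in $t$ as $\mu \to \lambda$. Hence $\sup_t |\varphi_n^\mu(t) - \varphi_n^\lambda(t)| < \epsilon$, if
$|\mu - \lambda| < \delta$, for an appropriate $\delta$. Thus by the corollary $\|\varphi_n^\mu(T) - \varphi_n^\lambda(T)\| < \epsilon$
and therefore $\|E(\lambda)f - \varphi_n^\mu(T)f\| < 2\epsilon$. Now with $\mu \ge \lambda$ we have that $E(\mu)E(\lambda) =
E(\lambda)$ and $E(\mu)\varphi_n^\mu(T) = E(\mu)$. As a result $\|E(\lambda)f - E(\mu)f\| < 2\epsilon$, if $\lambda \le \mu \le \lambda + \delta$.
Since $\epsilon$ was arbitrary, the right continuity is established.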

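Finally we verify the spectral representation (32). Let $a = \lambda_0 < \lambda_1 < \cdots < \lambda_k =
b$ be any partition of $[a, b]$ for which $\sup_j (\lambda_j - \lambda_{j-1}) < \delta$. Then since

$t = \sum_{j=1}^{k} t \left( \varphi^{\lambda_j}(t) - \varphi^{\lambda_{j-1}}(t) \right) + t\,\varphi^{\lambda_0}(t)$

we note that

$t \le \sum_{j=1}^{k} \lambda_j \left( \varphi^{\lambda_j}(t) - \varphi^{\lambda_{j-1}}(t) \right) + \lambda_0 \varphi^{\lambda_0}(t) \le t + \delta.$

Applying these functions to the operator $T$ we obtain

$T \le \sum_{j=1}^{k} \lambda_j \left( E(\lambda_j) - E(\lambda_{j-1}) \right) + \lambda_0 E(\lambda_0) \le T + \delta I,$

and thus $T$ differs in norm from the sum above by at most $\delta$. As a result

$\left| (Tf, f) - \sum_{j=1}^{k} \lambda_j \int_{(\lambda_{j-1}, \lambda_j]} d(E(\lambda)f, f) - \lambda_0 (E(\lambda_0)f, f) \right| \le \delta \|f\|^2.$

But as we vary the partitions of $[a, b]$, letting their meshes $\delta$ tend to zero, the
above sum tends to $\int_{a^-}^{b} \lambda\,d(E(\lambda)f, f)$. Therefore $(Tf, f) = \int_{a^-}^{b} \lambda\,d(E(\lambda)f, f)$, and
the polarization identity gives (32).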

A similar argument shows that if $\Phi$ is continuous on $[a, b]$, then the operator
$\Phi(T)$ has an analogous spectral representation

(36)    $(\Phi(T)f, g) = \int_{a^-}^{b} \Phi(\lambda)\,d(E(\lambda)f, g).$

This is because $\left| \Phi(t) - \sum_{j=1}^{k} \Phi(\lambda_j) \left( \varphi^{\lambda_j}(t) - \varphi^{\lambda_{j-1}}(t) \right) - \Phi(\lambda_0) \varphi^{\lambda_0}(t) \right| < \delta'$, where
$\delta' = \sup_{|t - t'| \le \delta} |\Phi(t) - \Phi(t')|$, which tends to zero as $\delta \to 0$.

This representation also extends to continuous $\Phi$ that are complex-valued (by
considering the real and imaginary parts separately) or to $\Phi$ that are pointwise
limits of decreasing sequences of continuous functions.


6.4 Spectrum 

We say that a bounded operator $S$ on $\mathcal{H}$ is invertible if $S$ is a bijection of $\mathcal{H}$
and its inverse, $S^{-1}$, is also bounded. Note that $S^{-1}$ satisfies $S^{-1}S = SS^{-1} = I$.
The spectrum of $S$, denoted by $\sigma(S)$, is the set of complex numbers $z$ for which
$S - zI$ is not invertible.

Proposition 6.8 If $T$ is symmetric, then $\sigma(T)$ is a closed subset of the interval
$[a, b]$ given by (33).

Note that if $z \notin [a, b]$, the function $\Phi(t) = (t - z)^{-1}$ is continuous on $[a, b]$ and
$\Phi(T)(T - zI) = (T - zI)\Phi(T) = I$, so $\Phi(T)$ is the inverse of $T - zI$. Now suppose
$T_0 = T - \lambda_0 I$ is invertible. Then we claim that $T_0 - \epsilon I$ is invertible for all (complex)
$\epsilon$ that are sufficiently small. This will prove that the complement of $\sigma(T)$ is
open. Indeed, $T_0 - \epsilon I = T_0(I - \epsilon T_0^{-1})$, and we can invert $T_0 - \epsilon I$
(formally) by writing its inverse as a sum

$$\sum_{n=0}^{\infty} \epsilon^n (T_0^{-1})^{n+1}.$$

Since $\sum_{n=0}^{\infty} \|\epsilon^n (T_0^{-1})^{n+1}\| \le \sum_{n=0}^{\infty} |\epsilon|^n \|T_0^{-1}\|^{n+1}$, the series converges when $|\epsilon| < \|T_0^{-1}\|^{-1}$,
and the sum is majorized by

(37) $$\|T_0^{-1}\| \, \frac{1}{1 - |\epsilon| \|T_0^{-1}\|}.$$

Thus we can define the operator $(T_0 - \epsilon I)^{-1}$ as $\lim_{N \to \infty} \sum_{n=0}^{N} \epsilon^n (T_0^{-1})^{n+1}$,
and it gives the desired inverse, as is easily verified.
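
The series argument can be checked numerically in finite dimensions. The following is a minimal sketch, assuming a 2 x 2 real symmetric matrix stands in for the operator $T_0$; the helper names (`mat_mul`, `neumann_inverse`, and so on) and the sample values of `T0` and `eps` are illustrative choices, not part of the text.

```python
# Pure-Python 2x2 matrix helpers (no external libraries are assumed).
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def mat_scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def mat_inv(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def neumann_inverse(T0, eps, terms):
    """Partial sum of eps^n (T0^{-1})^{n+1} for n = 0, ..., terms - 1."""
    T0_inv = mat_inv(T0)
    power = T0_inv                      # (T0^{-1})^{n+1}, starting at n = 0
    total = [[0.0, 0.0], [0.0, 0.0]]
    coeff = 1.0                         # eps^n
    for _ in range(terms):
        total = mat_add(total, mat_scale(coeff, power))
        power = mat_mul(power, T0_inv)
        coeff *= eps
    return total

# Hypothetical example: T0 is symmetric and invertible, and eps is small
# enough that |eps| * ||T0^{-1}|| < 1 (here roughly 0.22).
T0 = [[2.0, 1.0], [1.0, 3.0]]
eps = 0.3
approx = neumann_inverse(T0, eps, terms=60)
exact = mat_inv(mat_add(T0, mat_scale(-eps, IDENTITY)))
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(2) for j in range(2))
```

Here `max_error` compares the truncated series with a directly computed inverse of $T_0 - \epsilon I$; it only illustrates the convergence claim for one matrix and proves nothing in general.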

Our last assertion connects the spectrum $\sigma(T)$ with the spectral resolution
$\{E(\lambda)\}$.

Proposition 6.9 For each $f \in \mathcal{H}$, the Lebesgue-Stieltjes measure corresponding
to $F(\lambda) = (E(\lambda)f, f)$ is supported on $\sigma(T)$.

To put it another way, $F(\lambda)$ is constant on each open interval of the complement
of $\sigma(T)$.

To prove this, let $J$ be one of the open intervals in the complement of $\sigma(T)$,
$x_0 \in J$, and $J_0$ the sub-interval centered at $x_0$ of length $2\epsilon$, with $\epsilon < \|(T - x_0 I)^{-1}\|^{-1}$.
First note that if $z$ has non-vanishing imaginary part then $(T - zI)^{-1}$ is given by
$\Phi_z(T)$, with $\Phi_z(t) = (t - z)^{-1}$. Hence $(T - zI)^{-1}(T - \bar{z}I)^{-1}$ is given by $\Psi_z(T)$,
with $\Psi_z(t) = 1/|t - z|^2$. Therefore by the estimate given in (37) and the representation
(36) applied to $\Phi = \Psi_z$, we obtain

$$\int \frac{dF(\lambda)}{|\lambda - z|^2} \le A',$$

as long as $z$ is complex and $|x_0 - z| < \epsilon$. We can therefore obtain the same
inequality for $x$ real, $|x_0 - x| < \epsilon$. Now integration in $x \in J_0$, using the fact that
$\int_{J_0} \frac{dx}{|\lambda - x|^2} = \infty$ for every $\lambda \in J_0$, gives $\int_{J_0} dF(\lambda) = 0$. Thus $F(\lambda)$ is constant in $J_0$,
but since $x_0$ was an arbitrary point of $J$ the function $F(\lambda)$ is constant throughout
$J$ and the proposition is proved.


7 Exercises 

1. Let $X$ be a set and $\mathcal{M}$ a non-empty collection of subsets of $X$. Prove that if
$\mathcal{M}$ is closed under complements and countable unions of disjoint sets, then $\mathcal{M}$ is
a $\sigma$-algebra.

[Hint: Any countable union of sets can be written as a countable union of disjoint
sets.]


2. Let $(X, \mathcal{M}, \mu)$ be a measure space. One can define the completion of this
space as follows. Let $\overline{\mathcal{M}}$ be the collection of sets of the form $E \cup Z$, where $E \in \mathcal{M}$,
and $Z \subset F$ with $F \in \mathcal{M}$ and $\mu(F) = 0$. Also, define $\overline{\mu}(E \cup Z) = \mu(E)$. Then:

(a) $\overline{\mathcal{M}}$ is the smallest $\sigma$-algebra containing $\mathcal{M}$ and all subsets of elements of $\mathcal{M}$
of measure zero.

(b) The function $\overline{\mu}$ is a measure on $\overline{\mathcal{M}}$, and this measure is complete.

[Hint: To prove $\overline{\mathcal{M}}$ is a $\sigma$-algebra it suffices to see that if $E_1 \in \overline{\mathcal{M}}$, then $E_1^c \in \overline{\mathcal{M}}$.
Write $E_1 = E \cup Z$ with $Z \subset F$, $E$ and $F$ in $\mathcal{M}$. Then $E_1^c = (E \cup F)^c \cup (F - Z)$.]

3. Consider the exterior Lebesgue measure $m_*$ introduced in Chapter 1. Prove that
a set $E$ in $\mathbb{R}^d$ is Carathéodory measurable if and only if $E$ is Lebesgue measurable
in the sense of Chapter 1.

[Hint: If $E$ is Lebesgue measurable and $A$ is any set, choose a $G_\delta$ set $G$ such
that $A \subset G$ and $m_*(A) = m(G)$. Conversely, if $E$ is Carathéodory measurable and
$m_*(E) < \infty$, choose a $G_\delta$ set $G$ with $E \subset G$ and $m_*(E) = m_*(G)$. Then $G - E$
has exterior measure 0.]

4. Let $r$ be a rotation of $\mathbb{R}^d$. Using the fact that the mapping $x \mapsto r(x)$ preserves
Lebesgue measure (see Problem 4 in Chapter 2 and Exercise 26 in Chapter 3), show
that it induces a measure-preserving map of the sphere $S^{d-1}$ with its measure $d\sigma$.

A converse is stated in Problem 4.

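5. Use the polar coordinate formula to prove the following:

(a) $\int_{\mathbb{R}^d} e^{-\pi |x|^2}\, dx = 1$, when $d = 2$. Deduce from this that the same identity
holds for all $d$.

(b) $\left(\int_0^\infty e^{-\pi r^2} r^{d-1}\, dr\right) \sigma(S^{d-1}) = 1$, and as a result, $\sigma(S^{d-1}) = 2\pi^{d/2}/\Gamma(d/2)$.

(c) If $B$ is the unit ball, $v_d = m(B) = \pi^{d/2}/\Gamma(d/2 + 1)$, since this quantity
equals $\left(\int_0^1 r^{d-1}\, dr\right) \sigma(S^{d-1})$. (See Exercise 14 in Chapter 2.)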
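
As a numerical sanity check on the formulas in Exercise 5 (not a proof), one can evaluate the closed forms with the Gamma function and compare them with a crude quadrature of the radial integral. This is a sketch only: the function names and the midpoint-rule parameters `steps` and `upper` are assumptions made for illustration.

```python
import math

def sphere_area(d):
    """sigma(S^{d-1}) = 2 pi^{d/2} / Gamma(d/2)."""
    return 2 * math.pi ** (d / 2) / math.gamma(d / 2)

def ball_volume(d):
    """v_d = pi^{d/2} / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def radial_gaussian_integral(d, steps=20000, upper=10.0):
    """Midpoint-rule approximation of the integral of e^{-pi r^2} r^{d-1}
    over 0 < r < upper; the tail beyond `upper` is negligible."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += math.exp(-math.pi * r * r) * r ** (d - 1)
    return total * h

# Part (b): the product should be close to 1 in every dimension tried.
products = [radial_gaussian_integral(d) * sphere_area(d) for d in range(1, 6)]
# Part (c): the integral of r^{d-1} over (0, 1) is 1/d, so v_d = sigma(S^{d-1}) / d.
ratios = [ball_volume(d) * d / sphere_area(d) for d in range(1, 6)]
```

For $d = 2$ and $d = 3$ the closed forms reduce to the familiar values $2\pi$ and $4\pi$ for the sphere and $\pi$ and $4\pi/3$ for the ball.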


6. A version of Green's formula for the unit ball $B$ in $\mathbb{R}^d$ can be stated as follows.
Suppose $u$ and $v$ are a pair of functions that are in $C^2(\overline{B})$. Then one has


L 


{vAu — uAv) dx 



da. 


Here $S^{d-1}$ is the unit sphere with $d\sigma$ the measure defined in Section 3.2, and
$\partial u/\partial n$, $\partial v/\partial n$ denote the directional derivatives of $u$ and $v$ (respectively) along
the inner normals to $S^{d-1}$.

Show that the above can be derived from Lemma 4.5 of the previous chapter by
taking $\eta = \eta_\epsilon^+$ and letting $\epsilon \to 0$.


7. There is an alternate version of the mean-value property given in (21) of Chapter 5.
It can be stated as follows. Suppose $u$ is harmonic in $\Omega$, and $B$ is any ball
of center $x_0$ and radius $r$ whose closure is contained in $\Omega$. Then

$$u(x_0) = c \int_{S^{d-1}} u(x_0 + ry)\, d\sigma(y), \quad \text{with } c^{-1} = \sigma(S^{d-1}).$$

Conversely, a continuous function satisfying this mean-value property is harmonic.

[Hint: This can be proved as a direct consequence of the corresponding result
for averages over balls (Theorem 4.27 in Chapter 5), or can be deduced from
Exercise 6.]


8. The fact that the Lebesgue measure is uniquely characterized by its translation
invariance can be made precise by the following assertion: If $\mu$ is a Borel measure
on $\mathbb{R}^d$ that is translation-invariant, and is finite on compact sets, then $\mu$ is a
multiple of Lebesgue measure $m$. Prove this theorem by proceeding as follows.

(a) Suppose $Q_a$ denotes a translate of the cube $\{x : 0 < x_j \le a,\ j = 1, 2, \ldots, d\}$
of side length $a$. If we let $\mu(Q_1) = c$, then $\mu(Q_{1/n}) = cn^{-d}$ for each integer $n$.

(b) As a result $\mu$ is absolutely continuous with respect to $m$, and there is a
locally integrable function $f$ such that

$$\mu(E) = \int_E f\, dx.$$

(c) By the differentiation theorem (Corollary 1.7 in Chapter 3) it follows that
$f(x) = c$ a.e., and hence $\mu = cm$.

[Hint: $Q_1$ can be written as a disjoint union of $n^d$ translates of $Q_{1/n}$.]


9. Let $C([a, b])$ denote the vector space of continuous functions on the closed and
bounded interval $[a, b]$. Suppose we are given a Borel measure $\mu$ on this interval,
with $\mu([a, b]) < \infty$. Then

$$f \mapsto \ell(f) = \int_a^b f(x)\, d\mu(x)$$

is a linear functional on $C([a, b])$, with $\ell$ positive in the sense that $\ell(f) \ge 0$ if $f \ge 0$.

Prove that, conversely, for any linear functional $\ell$ on $C([a, b])$ that is positive in
the above sense, there is a unique finite Borel measure $\mu$ so that $\ell(f) = \int_a^b f\, d\mu$ for
$f \in C([a, b])$.

[Hint: Suppose $a = 0$ and $u \ge 0$. Define $F(u)$ by $F(u) = \lim_{\epsilon \to 0} \ell(f_\epsilon)$, where

$$f_\epsilon(x) = \begin{cases} 1 & \text{for } 0 \le x \le u, \\ 0 & \text{for } u + \epsilon \le x, \end{cases}$$

and $f_\epsilon$ is linear between $u$ and $u + \epsilon$. (See Figure 3.) Then $F$ is increasing and
right-continuous, and $\ell(f)$ can be written as $\int_a^b f(x)\, dF(x)$ via Theorem 3.5.]

The result also holds if $[a, b]$ is replaced by a closed infinite interval; we then
assume that $\ell$ is defined on the continuous functions of bounded support, and
obtain that the resulting $\mu$ is finite on all bounded intervals.

A generalization is given in Problem 5.

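10. Suppose $\nu, \nu_1, \nu_2$ are signed measures on $(X, \mathcal{M})$ and $\mu$ a (positive) measure
on $\mathcal{M}$. Using the symbols $\perp$ and $\ll$ defined in Section 4.2, prove:

(a) If $\nu_1 \perp \mu$ and $\nu_2 \perp \mu$, then $\nu_1 + \nu_2 \perp \mu$.

(b) If $\nu_1 \ll \mu$ and $\nu_2 \ll \mu$, then $\nu_1 + \nu_2 \ll \mu$.

(c) $\nu_1 \perp \nu_2$ implies $|\nu_1| \perp |\nu_2|$.

(d) $\nu \ll |\nu|$.

(e) If $\nu \perp \mu$ and $\nu \ll \mu$, then $\nu = 0$.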


Figure 3. The function $f_\epsilon$ in Exercise 9


11. Suppose that $F$ is an increasing normalized function on $\mathbb{R}$, and let $F =
F_A + F_C + F_J$ be the decomposition of $F$ in Exercise 24 in Chapter 3; here $F_A$ is
absolutely continuous, $F_C$ is continuous with $F_C' = 0$ a.e., and $F_J$ is a pure jump
function. Let $\mu = \mu_A + \mu_C + \mu_J$ with $\mu$, $\mu_A$, $\mu_C$, and $\mu_J$ the Borel measures
associated to $F$, $F_A$, $F_C$, and $F_J$, respectively. Verify that:

(i) $\mu_A$ is absolutely continuous with respect to Lebesgue measure and $\mu_A(E) =
\int_E F'(x)\, dx$ for every Lebesgue measurable set $E$.

(ii) As a result, if $F$ is absolutely continuous, then $\int f\, d\mu = \int f\, dF =
\int f(x)F'(x)\, dx$ whenever $f$ and $fF'$ are integrable.

(iii) $\mu_C + \mu_J$ and Lebesgue measure are mutually singular.


12. Suppose $\mathbb{R}^d - \{0\}$ is represented as $\mathbb{R}^+ \times S^{d-1}$, with $\mathbb{R}^+ = \{0 < r < \infty\}$.
Then every open set in $\mathbb{R}^d - \{0\}$ can be written as a countable union of open
rectangles of this product.

[Hint: Consider the countable collection of rectangles of the form

$$\{r_j < r < r_k'\} \times \{\gamma \in S^{d-1} : |\gamma - \gamma_\ell| < 1/n\}.$$

Here $r_j$ and $r_k'$ range over all positive rationals, and $\{\gamma_\ell\}$ is a countable dense set
of $S^{d-1}$.]


13. Let $m_j$ be the Lebesgue measure for the space $\mathbb{R}^{d_j}$, $j = 1, 2$. Consider the
product $\mathbb{R}^d = \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$ ($d = d_1 + d_2$), with $m$ the Lebesgue measure on $\mathbb{R}^d$. Show
that $m$ is the completion (in the sense of Exercise 2) of the product measure
$m_1 \times m_2$.


14. Suppose $(X_j, \mathcal{M}_j, \mu_j)$, $1 \le j \le k$, is a finite collection of measure spaces.
Show that parallel with the case $k = 2$ considered in Section 3 one can construct
a product measure $\mu_1 \times \mu_2 \times \cdots \times \mu_k$ on $X = X_1 \times X_2 \times \cdots \times X_k$. In fact, for
any set $E \subset X$ such that $E = E_1 \times E_2 \times \cdots \times E_k$, with $E_j \in \mathcal{M}_j$ for all $j$, define
$\mu_0(E) = \prod_{j=1}^{k} \mu_j(E_j)$. Verify that $\mu_0$ extends to a premeasure on the algebra $\mathcal{A}$
of finite disjoint unions of such sets, and then apply Theorem 1.5.

15. The product theory extends to infinitely many factors, under the requisite
assumptions. We consider measure spaces $(X_j, \mathcal{M}_j, \mu_j)$ with $\mu_j(X_j) = 1$ for all
but finitely many $j$. Define a cylinder set $E$ as

$$\{x = (x_j),\ x_j \in E_j,\ E_j \in \mathcal{M}_j, \text{ but } E_j = X_j \text{ for all but finitely many } j\}.$$

For such a set define $\mu_0(E) = \prod_{j=1}^{\infty} \mu_j(E_j)$. If $\mathcal{A}$ is the algebra generated by the
cylinder sets, $\mu_0$ extends to a premeasure on $\mathcal{A}$, and we can apply Theorem 1.5
again.

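16. Consider the $d$-dimensional torus $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$. Identify $\mathbb{T}^d$ as $\mathbb{T}^1 \times \cdots \times \mathbb{T}^1$
($d$ factors) and let $\mu$ be the product measure on $\mathbb{T}^d$ given by $\mu = \mu_1 \times \mu_2 \times \cdots \times \mu_d$,
where $\mu_j$ is Lebesgue measure on $X_j$ identified with the circle $\mathbb{T}$. That is, if we
represent each point in $X_j$ uniquely as $x_j$ with $0 < x_j \le 1$, then the measure $\mu_j$ is
the induced Lebesgue measure on $\mathbb{R}^1$ restricted to $(0, 1]$.

(a) Check that the completion of $\mu$ is Lebesgue measure induced on the cube
$Q = \{x : 0 < x_j \le 1,\ j = 1, \ldots, d\}$.

(b) For each function $f$ on $Q$ let $\tilde{f}$ be its extension to $\mathbb{R}^d$ which is periodic, that
is, $\tilde{f}(x + z) = \tilde{f}(x)$ for every $z \in \mathbb{Z}^d$. Then $f$ is measurable on $\mathbb{T}^d$ if and
only if $\tilde{f}$ is measurable on $\mathbb{R}^d$, and $f$ is continuous on $\mathbb{T}^d$ if and only if $\tilde{f}$ is
continuous on $\mathbb{R}^d$.

(c) Suppose $f$ and $g$ are integrable on $\mathbb{T}^d$. Show that the integral defining
$(f * g)(x) = \int_{\mathbb{T}^d} f(x - y)g(y)\, dy$ is finite for a.e. $x$, that $f * g$ is integrable
over $\mathbb{T}^d$, and that $f * g = g * f$.

(d) For any integrable function $f$ on $\mathbb{T}^d$, write

$$f \sim \sum_{n \in \mathbb{Z}^d} a_n e^{2\pi i n \cdot x}$$

to mean that $a_n = \int_{\mathbb{T}^d} f(x) e^{-2\pi i n \cdot x}\, dx$. Prove that if $g$ is also integrable,
and $g \sim \sum_{n \in \mathbb{Z}^d} b_n e^{2\pi i n \cdot x}$, then

$$f * g \sim \sum_{n \in \mathbb{Z}^d} a_n b_n e^{2\pi i n \cdot x}.$$

(e) Verify that $\{e^{2\pi i n \cdot x}\}_{n \in \mathbb{Z}^d}$ is an orthonormal basis for $L^2(\mathbb{T}^d)$. As a result
$\|f\|_{L^2(\mathbb{T}^d)}^2 = \sum_{n \in \mathbb{Z}^d} |a_n|^2$.

(f) Let $f$ be any continuous periodic function on $\mathbb{T}^d$. Then $f$ can be uniformly
approximated by finite linear combinations of the exponentials $\{e^{2\pi i n \cdot x}\}_{n \in \mathbb{Z}^d}$.

[Hint: For (e), reduce to the case $d = 1$ by Fubini's theorem. To prove (f) let
$g(x) = g_\epsilon(x) = \epsilon^{-d}$, if $0 < x_j \le \epsilon$, $j = 1, \ldots, d$, and $g_\epsilon(x) = 0$ elsewhere in $Q$. Then
$(f * g_\epsilon)(x) \to f(x)$ uniformly as $\epsilon \to 0$. However $(f * g_\epsilon)(x) = \sum a_n b_n e^{2\pi i n \cdot x}$ with
$b_n = \int_{\mathbb{T}^d} g_\epsilon(x) e^{-2\pi i n \cdot x}\, dx$, and $\sum |a_n b_n| < \infty$.]

17. By reducing to the case $d = 1$, show that each "rotation" $x \mapsto x + \alpha$ of the
torus $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$ is measure preserving, for any $\alpha \in \mathbb{R}^d$.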

18. Suppose $\tau$ is a measure-preserving transformation on a measure space $(X, \mu)$
with $\mu(X) = 1$. Recall that a measurable set $E$ is invariant if $\tau^{-1}(E)$ and $E$ differ
by a set of measure zero. A sharper notion is to require that $\tau^{-1}(E)$ equal $E$.
Prove that if $E$ is any invariant set, there is a set $E'$ so that $E' = \tau^{-1}(E')$, and $E$
and $E'$ differ by a set of measure zero.

[Hint: Let $E' = \limsup_{n \to \infty} \{\tau^{-n}(E)\} = \bigcap_{n=0}^{\infty} \left(\bigcup_{k \ge n} \tau^{-k}(E)\right)$.]

19. Let $\tau$ be a measure-preserving transformation on $(X, \mu)$ with $\mu(X) = 1$. Then
$\tau$ is ergodic if and only if whenever $\nu$ is absolutely continuous with respect to $\mu$ and
$\nu$ is invariant (that is, $\nu(\tau^{-1}(E)) = \nu(E)$ for all measurable sets $E$), then $\nu = c\mu$,
with $c$ a constant.

20. Suppose $\tau$ is a measure-preserving transformation on $(X, \mu)$. If

$$\mu(\tau^{-n}(E) \cap F) \to \mu(E)\mu(F)$$

as $n \to \infty$ for all measurable sets $E$ and $F$, then $(T^n f, g) \to (f, 1)(1, g)$ whenever
$f, g \in L^2(X)$ with $(Tf)(x) = f(\tau(x))$. Thus $\tau$ is mixing.

[Hint: By linearity the hypothesis implies the conclusion whenever $f$ and $g$ are
simple functions.]

21. Let $\mathbb{T}^d$ be the torus, and $\tau : x \mapsto x + \alpha$ the mapping arising in Exercise 17.
Then $\tau$ is ergodic if and only if $\alpha = (\alpha_1, \ldots, \alpha_d)$ with $\alpha_1, \alpha_2, \ldots, \alpha_d$, and 1 are
linearly independent over the rationals. To do this show that:

(a) $\frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) \to \int_{\mathbb{T}^d} f(x)\, dx$ as $m \to \infty$, for each $x \in \mathbb{T}^d$, whenever $f$ is
continuous and periodic and $\alpha$ satisfies the hypothesis.

(b) Prove as a result that in this case $\tau$ is uniquely ergodic.

[Hint: Use (f) in Exercise 16.]

22. Let $X = \prod_{i=1}^{\infty} X_i$, where each $(X_i, \mu_i)$ is identical to $(X_1, \mu_1)$, with $\mu_1(X_1) =
1$, and let $\mu$ be the corresponding product measure defined in Exercise 15. Define
the shift $\tau : X \to X$ by $\tau((x_1, x_2, \ldots)) = (x_2, x_3, \ldots)$ for $x = (x_i) \in \prod_{i=1}^{\infty} X_i$.

(a) Verify that $\tau$ is a measure-preserving transformation.

(b) Prove that $\tau$ is ergodic by showing that it is mixing.

(c) Note that in general $\tau$ is not uniquely ergodic.

If we define the corresponding shift on the two-sided infinite product, then $\tau$ is
also a measure-preserving isomorphism.

[Hint: For (b) note that $\mu(\tau^{-n}(E) \cap F) = \mu(E)\mu(F)$ whenever $E$ and $F$ are cylinder
sets and $n$ is sufficiently large. For (c) note that, for example, if we fix a point
$x \in X_1$, the set $E = \{(x_i) : x_j = x \text{ all } j\}$ is invariant.]

23. Let $X = \prod_{i=1}^{\infty} Z(2)$, where each factor is the two-point space $Z(2) = \{0, 1\}$
with $\mu_1(0) = \mu_1(1) = 1/2$, and suppose $\mu$ denotes the product measure on $X$. Consider
the mapping $D : X \to [0, 1]$ given by $D(\{a_j\}) = \sum_{j=1}^{\infty} \frac{a_j}{2^j}$. Then there are
denumerable sets $Z_1 \subset X$ and $Z_2 \subset [0, 1]$, such that:

(a) $D$ is a bijection from $X - Z_1$ to $[0, 1] - Z_2$.

(b) A set $E$ in $X$ is measurable if and only if $D(E)$ is measurable in $[0, 1]$, and
$\mu(E) = m(D(E))$, where $m$ is Lebesgue measure on $[0, 1]$.

(c) The shift map on $\prod_{i=1}^{\infty} Z(2)$ then becomes the doubling map of example (b)
in Section 5.4.


24. Consider the following generalization of the doubling map. For each integer
$m$, $m \ge 2$, we define the map $\tau_m$ of $(0, 1]$ by $\tau(x) = mx \bmod 1$.


(a) Verify that $\tau$ is measure-preserving for Lebesgue measure.

(b) Show that $\tau$ is mixing, hence ergodic.

(c) Prove as a consequence that almost every number $x$ is normal in the scale $m$,
in the following sense. Consider the $m$-adic expansion of $x$,

$$x = \sum_{j=1}^{\infty} \frac{a_j}{m^j}, \quad \text{where each } a_j \text{ is an integer } 0 \le a_j \le m - 1.$$

Then $x$ is normal if for each integer $k$, $0 \le k \le m - 1$,

$$\frac{\#\{j : a_j = k,\ 1 \le j \le N\}}{N} \to \frac{1}{m} \quad \text{as } N \to \infty.$$

Note the analogy with the equidistribution statements in Section 2, Chapter 4,
of Book I.

25. Show that the mean ergodic theorem still holds if we replace the assumption
that $T$ is an isometry by the assumption that $T$ is a contraction, that is, $\|Tf\| \le
\|f\|$ for all $f \in \mathcal{H}$.

[Hint: Prove that $T$ is a contraction if and only if $T^*$ is a contraction, and use the
identity $(f, T^*f) = (Tf, f)$.]

26. There is an $L^2$ version of the maximal ergodic theorem. Suppose $\tau$ is a
measure-preserving transformation on $(X, \mu)$. Here we do not assume that $\mu(X) <
\infty$. Then

$$f^*(x) = \sup_{m \ge 1} \frac{1}{m} \sum_{k=0}^{m-1} |f(\tau^k(x))|$$

satisfies

$$\|f^*\|_{L^2(X)} \le c\|f\|_{L^2(X)}, \quad \text{whenever } f \in L^2(X).$$

The proof is the same as outlined in Problem 6, Chapter 5 for the maximal function
on $\mathbb{R}^d$. With this, extend the pointwise ergodic theorem to the case where $\mu(X) =
\infty$, as follows:

(a) Show that $\lim_{m \to \infty} \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x))$ converges for a.e. $x$ to $P(f)(x)$ for
every $f \in L^2(X)$, because this holds for a dense subspace of $L^2(X)$.

(b) Prove that the conclusion holds for every $f \in L^1(X)$, because it holds for
the dense subspace $L^1(X) \cap L^2(X)$.


27. We saw that if $\|f_n\|_{L^2} \le 1$, then $\frac{f_n(x)}{n} \to 0$ as $n \to \infty$ for a.e. $x$. However, show
that the analogue where one replaces the $L^2$-norm by the $L^1$-norm fails, by
constructing a sequence $\{f_n\}$, $f_n \in L^1(X)$, $\|f_n\|_{L^1} \le 1$, but with $\limsup_{n \to \infty} \frac{f_n(x)}{n} =
\infty$ for a.e. $x$.

[Hint: Find intervals $I_n \subset [0, 1]$, so that $m(I_n) = 1/(n \log n)$ but $\limsup_{n \to \infty} \{I_n\} =
[0, 1]$. Then take $f_n(x) = n \log n\, \chi_{I_n}$.]

28. We know by the Borel-Cantelli lemma that if $\{E_n\}$ is a collection of measurable
sets in a measure space $(X, \mu)$ and $\sum_{n=1}^{\infty} \mu(E_n) < \infty$ then $E = \limsup_{n \to \infty} \{E_n\}$
has measure zero.

In the opposite direction, if $\tau$ is a mixing measure-preserving transformation
on $X$ (with $\mu(X) = 1$), then whenever $\sum_{n=1}^{\infty} \mu(E_n) = \infty$, there are integers $m =
m_n$ so that if $E_n' = \tau^{-m_n}(E_n)$, then $\limsup_{n \to \infty}(E_n') = X$, except for a set of
measure 0.


8 Problems 

1. Suppose $\Phi$ is a $C^1$ bijection of an open set $\mathcal{O}$ in $\mathbb{R}^d$ onto another open set $\mathcal{O}'$
in $\mathbb{R}^d$.

(a) If $E$ is a measurable subset of $\mathcal{O}$, then $\Phi(E)$ is also measurable.

(b) $m(\Phi(E)) = \int_E |\det \Phi'(x)|\, dx$, where $\Phi'$ is the Jacobian of $\Phi$.

(c) $\int_{\mathcal{O}'} f(y)\, dy = \int_{\mathcal{O}} f(\Phi(x)) |\det \Phi'(x)|\, dx$ whenever $f$ is integrable on $\mathcal{O}'$.

[Hint: To prove (a) follow the argument in Exercise 8, Chapter 1. For (b) assume
$E$ is a bounded open set, and write $E$ as $\bigcup_{j=1}^{\infty} Q_j$, where $Q_j$ are cubes whose
interiors are disjoint, and whose diameters are less than $\epsilon$. Let $z_k$ be the center of
$Q_k$. Then if $x \in Q_k$,

$$\Phi(x) = \Phi(z_k) + \Phi'(z_k)(x - z_k) + o(\epsilon),$$

hence $\Phi(Q_k) = \Phi(z_k) + \Phi'(z_k)(Q_k - z_k) + o(\epsilon)$, and as a result $(1 - \eta(\epsilon))\Phi'(z_k)(Q_k -
z_k) \subset \Phi(Q_k) - \Phi(z_k) \subset (1 + \eta(\epsilon))\Phi'(z_k)(Q_k - z_k)$, where $\eta(\epsilon) \to 0$ as $\epsilon \to 0$. This
means that

$$m(\Phi(E)) = \sum_k m(\Phi(Q_k)) = \sum_k |\det(\Phi'(z_k))|\, m(Q_k) + o(1) \quad \text{as } \epsilon \to 0$$

on account of the linear transformation property of the Lebesgue measure given in
Problem 4 of Chapter 2. Note that (b) is (c) for $f(\Phi(x)) = \chi_E(x)$.]

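2. Show as a consequence of the previous problem: the measure $d\mu = \frac{dx\, dy}{y^2}$ in the
upper half-plane $\mathbb{R}^2_+ = \{z = x + iy,\ y > 0\}$ is preserved by any fractional linear
transformation $z \mapsto \frac{az + b}{cz + d}$, where $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ belongs to $SL_2(\mathbb{R})$.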

3. Let $S$ be a hypersurface in $\mathbb{R}^d = \mathbb{R}^{d-1} \times \mathbb{R}$, given by

$$S = \{(x, y) \in \mathbb{R}^{d-1} \times \mathbb{R} : y = F(x)\},$$

with $F$ a $C^1$ function defined on an open set $\Omega$ in $\mathbb{R}^{d-1}$. For each subset $E \subset \Omega$
we write $\hat{E}$ for the corresponding subset of $S$ given by $\hat{E} = \{(x, F(x)) : x \in E\}$. We
note that the Borel sets of $S$ can be defined in terms of the metric on $S$ (which is
the restriction of the Euclidean metric on $\mathbb{R}^d$). Thus if $E$ is a Borel set in $\Omega$, then
$\hat{E}$ is a Borel subset of $S$.

(a) Let $\mu$ be the Borel measure on $S$ given by

$$\mu(\hat{E}) = \int_E \sqrt{1 + |\nabla F|^2}\, dx.$$

If $B$ is a ball in $\Omega$, let $\hat{B}^\delta = \{(x, y) \in \mathbb{R}^d,\ d((x, y), \hat{B}) < \delta\}$. Show that

$$\mu(\hat{B}) = \lim_{\delta \to 0} \frac{1}{2\delta}\, m(\hat{B}^\delta),$$

where $m$ denotes the $d$-dimensional Lebesgue measure. This result is analogous
to Theorem 4.4 in Chapter 3.

(b) One may apply (a) to the case when $S$ is the (upper) half of the unit sphere
in $\mathbb{R}^d$, given by $y = F(x)$, $F(x) = (1 - |x|^2)^{1/2}$, $|x| < 1$, $x \in \mathbb{R}^{d-1}$. Show
that in this case $d\mu = d\sigma$, the measure on the sphere arising in the polar
coordinate formula in Section 3.2.

(c) The above conclusion allows one to write an explicit formula for $d\sigma$ in
terms of spherical coordinates. Take, for example, the case $d = 3$, and
write $y = \cos\theta$, $x = (x_1, x_2) = (\sin\theta\cos\varphi, \sin\theta\sin\varphi)$ with $0 \le \theta < \pi/2$, $0 \le
\varphi < 2\pi$. Then according to (a) and (b) the element of area $d\sigma$ equals
$(1 - |x|^2)^{-1/2}\, dx$. Use the change of variable theorem in Problem 1 to deduce
that in this case $d\sigma = \sin\theta\, d\theta\, d\varphi$. This may be generalized to $d$ dimensions,
$d \ge 2$, to obtain the formulas in Section 2.4 of the appendix in Book I.

4.* Let $\mu$ be a Borel measure on the sphere $S^{d-1}$ which is rotation-invariant in the
following sense: $\mu(r(E)) = \mu(E)$ for every rotation $r$ of $\mathbb{R}^d$ and each Borel subset
$E$ of $S^{d-1}$. If $\mu(S^{d-1}) < \infty$, then $\mu$ is a constant multiple of the measure $\sigma$ arising
in the polar coordinate integration formula.

[Hint: Show that

$$\int_{S^{d-1}} Y_k(x)\, d\mu(x) = 0$$

for every surface spherical harmonic of degree $k \ge 1$. As a result, there is a constant
$c$ so that

$$\int_{S^{d-1}} f\, d\mu = c \int_{S^{d-1}} f\, d\sigma$$

for every continuous function $f$ on $S^{d-1}$.]


5.* Suppose $X$ is a metric space, and $\mu$ is a Borel measure on $X$ with the property
that $\mu(B) < \infty$ for every ball $B$. Define $C_0(X)$ to be the vector space of continuous
functions on $X$ that are each supported in some closed ball. Then $\ell(f) = \int_X f\, d\mu$
defines a linear functional on $C_0(X)$ that is positive, that is, $\ell(f) \ge 0$ if $f \ge 0$.

Conversely, for any positive linear functional $\ell$ on $C_0(X)$, there exists a unique
Borel measure $\mu$ that is finite on all balls, such that $\ell(f) = \int f\, d\mu$.

6. Consider an automorphism $A$ of $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$, that is, $A$ is a linear isomorphism
of $\mathbb{R}^d$ that preserves the lattice $\mathbb{Z}^d$. Note that $A$ can be written as a $d \times d$ matrix
whose entries are integers, with $\det A = \pm 1$. Define the mapping $\tau : \mathbb{T}^d \to \mathbb{T}^d$ by
$\tau(x) = A(x)$.

(a) Observe that $\tau$ is a measure-preserving isomorphism of $\mathbb{T}^d$.

(b) Show that $\tau$ is ergodic (in fact, mixing) if and only if $A$ has no eigenvalues
of the form $e^{2\pi i p/q}$, where $p$ and $q$ are integers.

(c) Note that $\tau$ is never uniquely ergodic.

[Hint: The condition (b) is the same as $(A^t)^q$ has no invariant vectors, where $A^t$ is
the transpose of $A$. Note also that $f(\tau^k(x)) = e^{2\pi i (A^t)^k(n) \cdot x}$ where $f(x) = e^{2\pi i n \cdot x}$.]


7.* There is a version of the maximal ergodic theorem that is akin to the "rising
sun lemma" and Exercise 6 in Chapter 3.

Suppose $f$ is real-valued, and $f^{\#}(x) = \sup_m \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x))$. Let $E_0 = \{x :
f^{\#}(x) > 0\}$. Then

$$\int_{E_0} f(x)\, dx \ge 0.$$

As a result (when we apply this to $f(x) - \alpha$), we get when $f \ge 0$ that

$$\mu\{x : f^*(x) > \alpha\} \le \frac{1}{\alpha} \int_{\{f^*(x) > \alpha\}} f(x)\, dx.$$

In particular, the constant $A$ in Theorem 5.3 can be taken to be 1.

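8. Let $X = [0, 1)$, $\tau(x) = \langle 1/x \rangle$, $x \ne 0$, $\tau(0) = 0$. Here $\langle x \rangle$ denotes the fractional
part of $x$. With the measure $d\mu = \frac{1}{\log 2} \frac{dx}{1 + x}$, we have of course $\mu(X) = 1$.

Show that $\tau$ is a measure-preserving transformation.

[Hint: $\sum_{k=1}^{\infty} \frac{1}{(x + k)(x + k + 1)} = \frac{1}{1 + x}$.]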
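
The identity in the hint, and the invariance it is meant to support, can be probed numerically. The sketch below is only an illustration under stated assumptions: it uses the fact that the preimage of $(0, t)$ under $x \mapsto \langle 1/x \rangle$ is the union over $k \ge 1$ of the intervals $(1/(k+t), 1/k)$, truncates all infinite sums at a finite number of terms, and uses function names invented for this example.

```python
import math

def gauss_measure(a, b):
    """mu((a, b)) for the measure d mu = (1 / log 2) dx / (1 + x)."""
    return (math.log1p(b) - math.log1p(a)) / math.log(2)

def preimage_measure(t, terms=200000):
    """mu of the preimage of (0, t), truncated to the first `terms` branches.
    Branch k contributes the interval (1/(k + t), 1/k)."""
    return sum(gauss_measure(1 / (k + t), 1 / k) for k in range(1, terms + 1))

def hint_partial_sum(x, terms=200000):
    """Partial sum of 1 / ((x + k)(x + k + 1)) over k = 1, ..., terms."""
    return sum(1 / ((x + k) * (x + k + 1)) for k in range(1, terms + 1))

# Compare both sides at a few sample points (arbitrary choices).
samples = [0.1, 0.25, 0.5, 0.9]
hint_gaps = [abs(hint_partial_sum(x) - 1 / (1 + x)) for x in samples]
measure_gaps = [abs(preimage_measure(t) - gauss_measure(0.0, t)) for t in samples]
```

Both lists of gaps shrink like the reciprocal of the number of terms, which is consistent with the telescoping structure of the sum in the hint.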

9.* The transformation $\tau$ in the previous problem is ergodic.

10.* The connection between continued fractions and the transformation $\tau(x) =
\langle 1/x \rangle$ will now be described. A continued fraction, $a_0 + 1/(a_1 + 1/a_2) \cdots$, also
written as $[a_0 a_1 a_2 \cdots]$, where the $a_j$ are positive integers, can be assigned to any
positive real number $x$ in the following way. Starting with $x$, we successively
transform it by two alternating operations: reducing it modulo 1 to lie in $[0, 1)$,
and then taking the reciprocal of that number. The integers $a_j$ that arise then
define the continued fraction of $x$.

Thus we set $x = a_0 + r_0$, where $a_0 = [x]$ = the greatest integer in $x$, and $r_0 \in
[0, 1)$. Next we write $1/r_0 = a_1 + r_1$, with $a_1 = [1/r_0]$, $r_1 \in [0, 1)$, to obtain
successively $1/r_{n-1} = a_n + r_n$, where $a_n = [1/r_{n-1}]$, $r_n \in [0, 1)$. If $r_n = 0$ for some $n$,
we write $a_k = 0$ for all $k > n$, and say that such a continued fraction terminates.

Note that if $0 \le x < 1$, then $r_0 = x$ and $a_1 = [1/x]$, while $r_1 = \langle 1/x \rangle = \tau(x)$.
More generally then, $a_k(x) = [1/\tau^{k-1}(x)] = a_1(\tau^{k-1}(x))$. The following properties
of continued fractions of positive real numbers $x$ are known:

(a) The continued fraction of $x$ terminates if and only if $x$ is rational.

(b) If $x = [a_0 a_1 \cdots a_n \cdots]$, and $x_N = [a_0 a_1 \cdots a_N 0 0 \cdots]$, then $x_N \to x$ as $N \to
\infty$. The sequence $\{x_N\}$ gives essentially an optimal approximation of $x$ by
rationals.

(c) The continued fraction is periodic, that is, $a_{k+N} = a_k$ for some $N \ge 1$, and
all sufficiently large $k$, if and only if $x$ is an algebraic number of degree $\le 2$
over the rationals.

(d) One can conclude that $\frac{a_1 + a_2 + \cdots + a_n}{n} \to \infty$ as $n \to \infty$ for almost every $x$. In
particular, the set of numbers $x$ whose continued fractions $[a_0 a_1 \cdots a_n \cdots]$
are bounded has measure zero.

[Hint: For (d) apply a consequence of the pointwise ergodic theorem, which is as
follows: Suppose $f \ge 0$, and $\int f\, d\mu = \infty$. If $\tau$ is ergodic, then $\frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) \to
\infty$ for a.e. $x$ as $m \to \infty$. In the present case take $f(x) = [1/x]$.]
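
The two alternating operations described in Problem 10 translate directly into a short routine. The following is a minimal sketch, not a definitive implementation: the function names are illustrative, exact rational inputs are handled with `fractions.Fraction`, and, unlike the convention in the text of writing $a_k = 0$ after termination, the loop simply stops when the remainder is zero.

```python
from fractions import Fraction
import math

def continued_fraction(x, max_terms=20):
    """Return [a0, a1, ...] by alternating 'take the integer part' and
    'invert the remainder'. Stops early if a remainder is exactly zero."""
    digits = []
    for _ in range(max_terms):
        a = math.floor(x)
        digits.append(a)
        r = x - a                # r lies in [0, 1)
        if r == 0:
            break                # the continued fraction terminates
        x = 1 / r
    return digits

def convergent(digits):
    """Evaluate the finite continued fraction [a0 a1 ... aN] exactly."""
    value = Fraction(digits[-1])
    for a in reversed(digits[:-1]):
        value = a + 1 / value
    return value

rational_digits = continued_fraction(Fraction(415, 93))
sqrt2_digits = continued_fraction(math.sqrt(2), max_terms=8)
sqrt2_approx = convergent(sqrt2_digits[:4])
```

With an exact rational input the expansion terminates, in line with property (a). With a floating-point input only the first several digits are trustworthy, since rounding errors grow at each inversion.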



Hausdorff Measure and 
Fractals 


Caratheodory developed a remarkably simple general- 
ization of Lebesgue’s measure theory which in particu- 
lar allowed him to define the p-dimensional measure of 
a set in g-dimensional space. In what follows, I present 
a small addition.... a clarification of p-dimensional 
measure that leads immediately to an extension to 
non-integral p, and thus gives rise to sets of fractional 
dimension. 

F. Hausdorff, 1919 


I coined fractal from the Latin adjective fractus. The 
corresponding Latin verb frangere means to “break” : 
to create irregular fragments. 

B. Mandelbrot, 1977 


The deeper study of the geometric properties of sets often requires 
an analysis of their extent or “mass” that goes beyond what can be 
expressed in terms of Lebesgue measure. It is here that the notions 
of the dimension of a set (which can be fractional) and an associated 
measure play a crucial role. 

Two initial ideas may help to provide an intuitive grasp of the concept
of the dimension of a set. The first can be understood in terms of how
the set replicates under scalings. Given the set $E$, let us suppose that
for some positive number $n$ we have that $nE = E_1 \cup \cdots \cup E_m$, where the
sets $E_j$ are $m$ essentially disjoint congruent copies of $E$. Note that if
$E$ were a line segment this would hold with $m = n$; if $E$ were a square,
we would have $m = n^2$; if $E$ were a cube, then $m = n^3$; etc. Thus, more
generally, we might be tempted to say that $E$ has dimension $\alpha$ if $m = n^\alpha$.
Observe that if $E$ is the Cantor set $\mathcal{C}$ in $[0, 1]$, then $3\mathcal{C}$ consists of 2 copies
of $\mathcal{C}$, one in $[0, 1]$ and the other in $[2, 3]$. Here $n = 3$, $m = 2$, and we would
be led to conclude that $\log 2/\log 3$ is the dimension of the Cantor set.

Another approach is relevant for curves that are not necessarily rectifiable.
Start with a curve $\Gamma = \{\gamma(t) : a \le t \le b\}$, and for each $\epsilon > 0$
consider polygonal lines joining $\gamma(a)$ to $\gamma(b)$, whose vertices lie on successive
points of $\Gamma$, with each segment not exceeding $\epsilon$ in length. Denote
by $\#(\epsilon)$ the least number of segments that arise for such polygonal lines.
If $\#(\epsilon) \approx \epsilon^{-1}$ as $\epsilon \to 0$, then $\Gamma$ is rectifiable. However, $\#(\epsilon)$ may well
grow more rapidly than $\epsilon^{-1}$ as $\epsilon \to 0$. If we had $\#(\epsilon) \approx \epsilon^{-\alpha}$, $1 < \alpha$,
then, in the spirit of the previous example, it would be natural to say
that $\Gamma$ has dimension $\alpha$. These considerations are even of interest in
other parts of science. For instance, in studying the question of determining
the length of the border of a country or its coastline, L.F. Richardson
found that the length of the west coast of Britain obeyed the empirical
law $\#(\epsilon) \approx \epsilon^{-\alpha}$, with $\alpha$ approximately 1.5. Thus one might conclude
that the coast has fractional dimension!
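
Both heuristics reduce to reading off an exponent, which the following sketch makes concrete. It assumes the scaling rule $m = n^\alpha$ from the first idea and, for the second, uses model data rather than measurements: stage $k$ of the standard von Koch construction is a polygonal line with $4^k$ segments of length $3^{-k}$, which gives an upper bound for the segment count at scale $3^{-k}$ rather than the exact minimum. All function names are illustrative.

```python
import math

def similarity_dimension(n, m):
    """Exponent alpha with m = n**alpha, that is, alpha = log m / log n."""
    return math.log(m) / math.log(n)

def growth_exponent(counts_by_scale):
    """Least-squares slope of log(count) against log(1/eps) for a list of
    (eps, count) pairs; this estimates alpha in count ~ eps**(-alpha)."""
    xs = [math.log(1 / eps) for eps, _ in counts_by_scale]
    ys = [math.log(count) for _, count in counts_by_scale]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

segment_dim = similarity_dimension(2, 2)   # doubling a segment gives 2 copies
square_dim = similarity_dimension(2, 4)    # doubling a square gives 4 copies
cantor_dim = similarity_dimension(3, 2)    # 3C consists of 2 copies of C

# Model data (an assumption, not a measurement): von Koch stage k.
koch_counts = [(3.0 ** -k, 4 ** k) for k in range(1, 9)]
koch_exponent = growth_exponent(koch_counts)
```

On real data such as a coastline, the pairs passed to `growth_exponent` would come from measurement, and the fitted slope would only be an empirical estimate over the range of scales sampled.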

While there are a number of different ways to make some of these 
heuristic notions precise, the theory that has the widest scope and great- 
est flexibility is the one involving Hausdorff measure and Hausdorff di- 
mension. Probably the most elegant and simplest illustration of this 
theory can be seen in terms of its application to a general class of self- 
similar sets, and this is what we consider first. Among these are the 
curves of von Koch type, and these can have any dimension between 1 
and 2. 

Next, we turn to an example of a space-filling curve, which, broadly 
speaking, falls under the scope of self-replicating constructions. Not 
only does this curve have an intrinsic interest, but its nature reveals the 
important fact that from the point of view of measure theory the unit 
interval and the unit square are the same. 

Our final topic is of a somewhat different nature. It begins with the
realization of an unexpected regularity that all subsets of $\mathbb{R}^d$ (of finite
Lebesgue measure) enjoy, when $d \ge 3$. This property fails in two dimensions,
and the key counter-example is the Besicovitch set. This set
appears also in a number of other problems. While it has measure zero,
this is barely so, since its Hausdorff dimension is necessarily 2.

1 Hausdorff measure 

The theory begins with the introduction of a new notion of volume or
mass. This "measure" is closely tied with the idea of dimension which
prevails throughout the subject. More precisely, following Hausdorff,
one considers for each appropriate set $E$ and each $\alpha > 0$ the quantity
$m_\alpha(E)$, which can be interpreted as the $\alpha$-dimensional mass of $E$ among
sets of dimension $\alpha$, where the word "dimension" carries (for now) only
an intuitive meaning. Then, if $\alpha$ is larger than the dimension of the set
$E$, the set has a negligible mass, and we have $m_\alpha(E) = 0$. If $\alpha$ is smaller
than the dimension of $E$, then $E$ is very large (comparatively), hence
$m_\alpha(E) = \infty$. For the critical case when $\alpha$ is the dimension of $E$, the
quantity $m_\alpha(E)$ describes the actual $\alpha$-dimensional size of the set.

Two examples, to which we shall return in more detail later, illustrate 
this circle of ideas. 

First, recall that the standard Cantor set $\mathcal{C}$ in $[0, 1]$ has zero Lebesgue
measure. This statement expresses the fact that $\mathcal{C}$ has one-dimensional
mass or length equal to zero. However, we shall prove that $\mathcal{C}$ has a
well-defined fractional Hausdorff dimension of $\log 2/\log 3$, and that the
corresponding Hausdorff measure of the Cantor set is positive and finite.

Another illustration of the theory developed below consists of starting
with $\Gamma$, a rectifiable curve in the plane. Then $\Gamma$ has zero two-dimensional
Lebesgue measure. This is intuitively clear, since $\Gamma$ is a one-dimensional
object in a two-dimensional space. This is where the Hausdorff measure
comes into play: the quantity $m_1(\Gamma)$ is not only finite, but precisely equal
to the length of $\Gamma$ as we defined it in Section 3.1 of Chapter 3.

We first consider the relevant exterior measure, defined in terms of 
coverings, whose restriction to the Borel sets is the desired Hausdorff 
measure. 

For any subset $E$ of $\mathbb{R}^d$, we define the exterior $\alpha$-dimensional Hausdorff
measure of $E$ by

$$m_\alpha^*(E) = \lim_{\delta \to 0} \inf \left\{ \sum_k (\operatorname{diam} F_k)^\alpha : E \subset \bigcup_{k=1}^{\infty} F_k,\ \operatorname{diam} F_k \le \delta \text{ all } k \right\},$$

where $\operatorname{diam} S$ denotes the diameter of the set $S$, that is, $\operatorname{diam} S =
\sup\{|x - y| : x, y \in S\}$. In other words, for each $\delta > 0$ we consider covers
of $E$ by countable families of (arbitrary) sets with diameter at most $\delta$,
and take the infimum of the sum $\sum_k (\operatorname{diam} F_k)^\alpha$. We then define $m_\alpha^*(E)$
as the limit of these infimums as $\delta$ tends to 0. We note that the quantity

$$\mathcal{H}_\alpha^\delta(E) = \inf \left\{ \sum_k (\operatorname{diam} F_k)^\alpha : E \subset \bigcup_{k=1}^{\infty} F_k,\ \operatorname{diam} F_k \le \delta \text{ all } k \right\}$$

is increasing as $\delta$ decreases, so that the limit

$$m_\alpha^*(E) = \lim_{\delta \to 0} \mathcal{H}_\alpha^\delta(E)$$

exists, although $m_\alpha^*(E)$ could be infinite. We note that in particular,
one has $\mathcal{H}_\alpha^\delta(E) \le m_\alpha^*(E)$ for all $\delta > 0$. When defining the exterior
measure $m_\alpha^*(E)$ it is important to require that the coverings be of
sets of arbitrarily small diameters; this is the thrust of the definition
$m_\alpha^*(E) = \lim_{\delta \to 0} \mathcal{H}_\alpha^\delta(E)$. This requirement, which is not relevant for
Lebesgue measure, is needed to ensure the basic additive feature stated
in Property 3 below. (See also Exercise 12.)
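
To see how the exponent $\alpha$ interacts with coverings, one can evaluate the sum $\sum_k (\operatorname{diam} F_k)^\alpha$ for one specific family of covers of the Cantor set. The sketch below assumes the standard construction, in which stage $k$ consists of $2^k$ closed intervals of length $3^{-k}$; since the definition takes an infimum over all covers, these sums only bound the quantity from above and do not compute the Hausdorff measure. The function name is illustrative.

```python
import math

def cantor_cover_sum(alpha, stage):
    """Sum of (diam F)^alpha over the 2**stage intervals of length
    3**(-stage) that cover the Cantor set at the given stage."""
    return (2 ** stage) * (3.0 ** (-stage * alpha))

critical = math.log(2) / math.log(3)

# Behaviour of the cover sums as the stage grows (diameters shrink):
at_critical = [cantor_cover_sum(critical, k) for k in (5, 15, 30)]
above_critical = cantor_cover_sum(0.7, 40)   # tends to 0 as the stage grows
below_critical = cantor_cover_sum(0.5, 40)   # grows without bound
```

At the critical exponent $\log 2/\log 3$ every stage gives a sum equal to 1 up to rounding, for larger exponents the sums tend to 0, and for smaller exponents they blow up. That matches the trichotomy described informally at the start of this section, although a lower bound for the measure requires a separate argument.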

Scaling is the key notion that appears at the heart of the definition of
the exterior Hausdorff measure. Loosely speaking, the measure of a set
scales according to its dimension. For instance, if $\Gamma$ is a one-dimensional
subset of $\mathbb{R}^d$, say a smooth curve of length $L$, then $r\Gamma$ has total length
$rL$. If $Q$ is a cube in $\mathbb{R}^d$, the volume of $rQ$ is $r^d|Q|$. This feature is
captured in the definition of exterior Hausdorff measure by the fact that
if the set $F$ is scaled by $r$, then $(\operatorname{diam} F)^\alpha$ scales by $r^\alpha$. This key idea
reappears in the study of self-similar sets in Section 2.2.

We begin with a list of properties satisfied by the Hausdorff exterior 
measure. 

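Property 1 (Monotonicity) If $E_1 \subset E_2$, then
$m_\alpha^*(E_1) \le m_\alpha^*(E_2)$.

This is straightforward, since any cover of $E_2$ is also a cover of $E_1$.

Property 2 (Sub-additivity)
$m_\alpha^*(\bigcup_{j=1}^\infty E_j) \le \sum_{j=1}^\infty m_\alpha^*(E_j)$ for any
countable family $\{E_j\}$ of sets in $\mathbb{R}^d$.

For the proof, write $E = \bigcup_j E_j$, fix $\delta$ and $\epsilon > 0$, and
choose for each $j$ a cover $\{F_{j,k}\}_{k=1}^\infty$ of $E_j$ by sets of
diameter less than $\delta$ such that
$\sum_k (\operatorname{diam} F_{j,k})^\alpha \le \mathcal{H}_\alpha^\delta(E_j) + \epsilon/2^j$.
Since $\bigcup_{j,k} F_{j,k}$ is a cover of $E$ by sets of diameter less than
$\delta$, we must have

\[
\mathcal{H}_\alpha^\delta(E) \le \sum_{j=1}^\infty \mathcal{H}_\alpha^\delta(E_j) + \epsilon
\le \sum_{j=1}^\infty m_\alpha^*(E_j) + \epsilon.
\]

Since $\epsilon$ is arbitrary, the inequality
$\mathcal{H}_\alpha^\delta(E) \le \sum_j m_\alpha^*(E_j)$ holds, and we let
$\delta$ tend to 0 to prove the countable sub-additivity of $m_\alpha^*$.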

Property 3 If $d(E_1, E_2) > 0$, then
$m_\alpha^*(E_1 \cup E_2) = m_\alpha^*(E_1) + m_\alpha^*(E_2)$.

It suffices to prove that
$m_\alpha^*(E_1 \cup E_2) \ge m_\alpha^*(E_1) + m_\alpha^*(E_2)$, since the
reverse inequality is guaranteed by sub-additivity. Fix $\epsilon > 0$ with
$\epsilon < d(E_1, E_2)$. Given any cover of $E_1 \cup E_2$ with sets
$F_1, F_2, \ldots$ of diameter less than $\delta$, where $\delta < \epsilon$,
we let

\[
F_j' = E_1 \cap F_j \quad \text{and} \quad F_j'' = E_2 \cap F_j.
\]

Then $\{F_j'\}$ and $\{F_j''\}$ are covers for $E_1$ and $E_2$, respectively,
and are disjoint. Moreover, since each $F_j$ has diameter less than
$d(E_1, E_2)$, it meets at most one of $E_1$ and $E_2$. Hence,

\[
\sum_j (\operatorname{diam} F_j')^\alpha + \sum_i (\operatorname{diam} F_i'')^\alpha
\le \sum_k (\operatorname{diam} F_k)^\alpha.
\]

Taking the infimum over the coverings, and then letting $\delta$ tend to zero
yields the desired inequality.

At this point, we note that $m_\alpha^*$ satisfies all the properties of a
metric Carathéodory exterior measure as discussed in Chapter 6. Thus
$m_\alpha^*$ is a countably additive measure when restricted to the Borel
sets. We shall therefore restrict ourselves to Borel sets and write
$m_\alpha(E)$ instead of $m_\alpha^*(E)$. The measure $m_\alpha$ is called the
$\alpha$-dimensional Hausdorff measure.

Property 4 If $\{E_j\}$ is a countable family of disjoint Borel sets, and
$E = \bigcup_{j=1}^\infty E_j$, then

\[
m_\alpha(E) = \sum_{j=1}^\infty m_\alpha(E_j).
\]

For what follows in this chapter, the full additivity in the above prop- 
erty is not needed, and we can manage with a weaker form whose proof 
is elementary and not dependent on the developments of Chapter 6. (See 
Exercise 2.) 

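Property 5 Hausdorff measure is invariant under translations

\[
m_\alpha(E + h) = m_\alpha(E) \quad \text{for all } h \in \mathbb{R}^d,
\]

and rotations

\[
m_\alpha(rE) = m_\alpha(E),
\]

where $r$ is a rotation in $\mathbb{R}^d$.

Moreover, it scales as follows:

\[
m_\alpha(\lambda E) = \lambda^\alpha m_\alpha(E) \quad \text{for all } \lambda > 0.
\]

These conclusions follow once we observe that the diameter of a set $S$
is invariant under translations and rotations, and satisfies
$\operatorname{diam}(\lambda S) = \lambda \operatorname{diam}(S)$ for $\lambda > 0$.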

We describe next a series of properties of Hausdorff measure, the first 
of which is immediate from the definitions. 
Property 6 The quantity $m_0(E)$ counts the number of points in $E$,
while $m_1(E) = m(E)$ for all Borel sets $E \subset \mathbb{R}$. (Here $m$
denotes the Lebesgue measure on $\mathbb{R}$.)

In fact, note that in one dimension every set of diameter $\delta$ is
contained in an interval of length $\delta$ (and for an interval its length
equals its Lebesgue measure).

In general, $d$-dimensional Hausdorff measure in $\mathbb{R}^d$ is, up to a
constant factor, equal to Lebesgue measure.

Property 7 If $E$ is a Borel subset of $\mathbb{R}^d$, then
$c_d m_d(E) = m(E)$ for some constant $c_d$ that depends only on the
dimension $d$.

The constant $c_d$ equals $m(B)/(\operatorname{diam} B)^d$, for the unit ball
$B$; note that this ratio is the same for all balls $B$ in $\mathbb{R}^d$, and
so $c_d = v_d/2^d$ (where $v_d$ denotes the volume of the unit ball). The
proof of this property relies on the so-called iso-diametric inequality,
which states that among all sets of a given diameter, the ball has largest
volume. (See Problem 2.) Without using this geometric fact one can prove the
following substitute.

Property 7' If $E$ is a Borel subset of $\mathbb{R}^d$ and $m(E)$ is its
Lebesgue measure, then $m_d(E) \approx m(E)$, in the sense that

\[
c_d m_d(E) \le m(E) \le 2^d c_d m_d(E).
\]

Using Exercise 26 in Chapter 3 we can find for every $\epsilon, \delta > 0$ a
covering of $E$ by balls $\{B_j\}$ such that $\operatorname{diam} B_j < \delta$,
while $\sum_j m(B_j) \le m(E) + \epsilon$. Now,

\[
\mathcal{H}_d^\delta(E) \le \sum_j (\operatorname{diam} B_j)^d
= c_d^{-1} \sum_j m(B_j) \le c_d^{-1}(m(E) + \epsilon).
\]

Letting $\delta$ and $\epsilon$ tend to 0, we get $m_d(E) \le c_d^{-1} m(E)$.
For the reverse direction, let $E \subset \bigcup_j F_j$ be a covering with
$\sum_j (\operatorname{diam} F_j)^d \le m_d(E) + \epsilon$. We can always find
closed balls $B_j$ centered at a point of $F_j$ so that $B_j \supset F_j$ and
$\operatorname{diam} B_j = 2 \operatorname{diam} F_j$. However,
$m(E) \le \sum_j m(B_j)$, since $E \subset \bigcup_j B_j$, and the last sum
equals

\[
\sum_j c_d (\operatorname{diam} B_j)^d = 2^d c_d \sum_j (\operatorname{diam} F_j)^d
\le 2^d c_d (m_d(E) + \epsilon).
\]

Letting $\epsilon \to 0$ gives $m(E) \le 2^d c_d m_d(E)$.
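To make the constant $c_d = v_d/2^d$ of Property 7 concrete, here is a minimal Python sketch that evaluates it for small $d$. It assumes the standard Gamma-function formula $v_d = \pi^{d/2}/\Gamma(d/2 + 1)$ for the volume of the unit ball, which is not derived in this section; the function names are ours.

```python
import math


def unit_ball_volume(d: int) -> float:
    """Lebesgue measure v_d of the unit ball in R^d (assumed Gamma-function formula)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def hausdorff_constant(d: int) -> float:
    """c_d = v_d / 2^d, that is, m(B) / (diam B)^d for the unit ball B."""
    return unit_ball_volume(d) / 2 ** d


for d in (1, 2, 3):
    print(d, hausdorff_constant(d))
```

For $d = 1$ the value is 1, in agreement with Property 6, which asserts that $m_1$ and Lebesgue measure coincide on $\mathbb{R}$.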

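Property 8 If $m_\alpha^*(E) < \infty$ and $\beta > \alpha$, then
$m_\beta^*(E) = 0$. Also, if $m_\alpha^*(E) > 0$ and $\beta < \alpha$, then
$m_\beta^*(E) = \infty$.

Indeed, if $\operatorname{diam} F \le \delta$, and $\beta > \alpha$, then

\[
(\operatorname{diam} F)^\beta = (\operatorname{diam} F)^{\beta - \alpha} (\operatorname{diam} F)^\alpha
\le \delta^{\beta - \alpha} (\operatorname{diam} F)^\alpha.
\]

Consequently

\[
\mathcal{H}_\beta^\delta(E) \le \delta^{\beta - \alpha} \mathcal{H}_\alpha^\delta(E)
\le \delta^{\beta - \alpha} m_\alpha^*(E).
\]

Since $m_\alpha^*(E) < \infty$ and $\beta - \alpha > 0$, we find in the limit
as $\delta$ tends to 0 that $m_\beta^*(E) = 0$.

The contrapositive gives $m_\beta^*(E) = \infty$ whenever $m_\alpha^*(E) > 0$
and $\beta < \alpha$.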
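The termwise inequality behind Property 8 can be checked numerically on any finite family of diameters. The following sketch does this for one arbitrary, made-up list of diameters (not data from the text), taking $\delta$ to be the largest of them; the helper names are hypothetical.

```python
def covering_sum(diameters, exponent):
    """Sum of (diam F_k)^exponent over a finite family of sets."""
    return sum(d ** exponent for d in diameters)


def property8_sides(diameters, alpha, beta):
    """Return both sides of sum d^beta <= delta^(beta - alpha) * sum d^alpha."""
    delta = max(diameters)  # every diameter is at most delta
    lhs = covering_sum(diameters, beta)
    rhs = delta ** (beta - alpha) * covering_sum(diameters, alpha)
    return lhs, rhs


diams = [0.1, 0.05, 0.02, 0.1, 0.07]  # illustrative values only
lhs, rhs = property8_sides(diams, alpha=0.5, beta=1.2)
print(lhs, rhs)
```

As $\delta$ shrinks, the factor $\delta^{\beta - \alpha}$ tends to 0, which is what forces $m_\beta^*(E) = 0$ when $m_\alpha^*(E)$ is finite.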

We now make some easy observations that are consequences of the 
above properties. 

1. If $I$ is a finite line segment in $\mathbb{R}^d$, then
$0 < m_1(I) < \infty$.

2. More generally, if $Q$ is a $k$-cube in $\mathbb{R}^d$ (that is, $Q$ is the
product of $k$ non-trivial intervals and $d - k$ points), then
$0 < m_k(Q) < \infty$.

3. If $\mathcal{O}$ is a non-empty open set in $\mathbb{R}^d$, then
$m_\alpha(\mathcal{O}) = \infty$ whenever $\alpha < d$. Indeed, this follows
because $m_d(\mathcal{O}) > 0$.

4. Note that we can always take $\alpha \le d$. This is because when
$\alpha > d$, $m_\alpha$ vanishes on every ball, and hence on all of
$\mathbb{R}^d$.

2 Hausdorff dimension 

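Given a Borel subset $E$ of $\mathbb{R}^d$, we deduce from Property 8 that
there exists a unique $\alpha$ such that

\[
m_\beta(E) = \begin{cases} \infty & \text{if } \beta < \alpha, \\ 0 & \text{if } \alpha < \beta. \end{cases}
\]

In other words, $\alpha$ is given by

\[
\alpha = \sup\{\beta : m_\beta(E) = \infty\} = \inf\{\beta : m_\beta(E) = 0\}.
\]

We say that $E$ has Hausdorff dimension $\alpha$, or more succinctly, that
$E$ has dimension $\alpha$. We shall write $\alpha = \dim E$. At the critical
value $\alpha$ we can say no more than that in general the quantity
$m_\alpha(E)$ satisfies $0 \le m_\alpha(E) \le \infty$. If $E$ is bounded and
the inequalities are strict, that is, $0 < m_\alpha(E) < \infty$, we say that
$E$ has strict Hausdorff dimension $\alpha$.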
The term fractal is commonly applied to sets of fractional dimension. 

In general, calculating the Hausdorff measure of a set is a difficult 
problem. However, it is possible in some cases to bound this measure 
from above and below, and hence determine the dimension of the set in 
question. A few examples will illustrate these new concepts. 
2.1 Examples
The Cantor set 

The first striking example consists of the Cantor set $\mathcal{C}$, which
was constructed in Chapter 1 by successively removing the middle-third
intervals in $[0, 1]$.

Theorem 2.1 The Cantor set $\mathcal{C}$ has strict Hausdorff dimension
$\alpha = \log 2/\log 3$.

The inequality

\[
m_\alpha(\mathcal{C}) \le 1
\]

follows from the construction of $\mathcal{C}$ and the definitions. Indeed,
recall from Chapter 1 that $\mathcal{C} = \bigcap C_k$, where each $C_k$ is a
finite union of $2^k$ intervals of length $3^{-k}$. Given $\delta > 0$, we
first choose $K$ so large that $3^{-K} < \delta$. Since the set $C_K$ covers
$\mathcal{C}$ and consists of $2^K$ intervals of diameter $3^{-K} < \delta$,
we must have

\[
\mathcal{H}_\alpha^\delta(\mathcal{C}) \le 2^K (3^{-K})^\alpha.
\]

However, $\alpha$ satisfies precisely $3^\alpha = 2$, hence
$2^K (3^{-K})^\alpha = 1$, and therefore

\[
m_\alpha(\mathcal{C}) \le 1.
\]
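The role of the critical exponent in this covering argument can be seen numerically. The sketch below, in Python with function names of our own choosing, evaluates the sum $2^K (3^{-K})^s$ over the intervals of $C_K$ for several exponents $s$. These sums are only upper bounds for $\mathcal{H}_s^\delta(\mathcal{C})$, so their growth for $s < \alpha$ is suggestive rather than a proof that $m_s(\mathcal{C}) = \infty$.

```python
import math

ALPHA = math.log(2) / math.log(3)  # the exponent with 3^ALPHA = 2


def cantor_stage_sum(K: int, exponent: float) -> float:
    """Sum of (diam)^exponent over the 2^K intervals of length 3^-K forming C_K."""
    return 2 ** K * (3.0 ** -K) ** exponent


for K in (1, 5, 10, 20):
    print(K,
          cantor_stage_sum(K, 0.5),    # exponent below ALPHA: the sums grow
          cantor_stage_sum(K, ALPHA),  # critical exponent: the sums stay at 1
          cantor_stage_sum(K, 0.8))    # exponent above ALPHA: the sums decay
```

Only at $s = \log 2/\log 3$ do the sums remain bounded away from both 0 and $\infty$ as $K$ grows.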

The reverse inequality, which consists of proving that 0 < ma{C), re- 
quires a further idea. Here we rely on the Cantor-Lebesgue function, 
which maps C surjectively onto [0, 1]. The key fact we shall use about 
this function is that it satisfies a precise continuity condition that reflects 
the dimension of the Cantor set. 

A function $f$ defined on a subset $E$ of $\mathbb{R}^d$ satisfies a
Lipschitz condition on $E$ if there exists $M > 0$ such that

\[
|f(x) - f(y)| \le M|x - y| \quad \text{for all } x, y \in E.
\]

More generally, a function $f$ satisfies a Lipschitz condition with
exponent $\gamma$ (or is Hölder $\gamma$) if

\[
|f(x) - f(y)| \le M|x - y|^\gamma \quad \text{for all } x, y \in E.
\]

The only interesting case is when $0 < \gamma \le 1$. (See Exercise 3.)

Lemma 2.2 Suppose a function $f$ defined on a compact set $E$ satisfies
a Lipschitz condition with exponent $\gamma$. Then

(i) $m_\beta(f(E)) \le M^\beta m_\alpha(E)$ if $\beta = \alpha/\gamma$.

(ii) $\dim f(E) \le \frac{1}{\gamma} \dim E$.

Proof. Suppose $\{F_k\}$ is a countable family of sets that covers $E$.
Then $\{f(E \cap F_k)\}$ covers $f(E)$ and, moreover, $f(E \cap F_k)$ has
diameter at most $M(\operatorname{diam} F_k)^\gamma$. Hence

\[
\sum_k (\operatorname{diam} f(E \cap F_k))^{\alpha/\gamma}
\le M^{\alpha/\gamma} \sum_k (\operatorname{diam} F_k)^\alpha,
\]

and part (i) follows. This result now immediately implies conclusion (ii).


Lemma 2.3 The Cantor-Lebesgue function $F$ on $\mathcal{C}$ satisfies a
Lipschitz condition with exponent $\gamma = \log 2/\log 3$.

Proof. The function $F$ was constructed in Section 3.1 of Chapter 3 as
the limit of a sequence $\{F_n\}$ of piecewise linear functions. The function
$F_n$ increases by at most $2^{-n}$ on each interval of length $3^{-n}$. So
the slope of $F_n$ is always bounded by $(3/2)^n$, and hence

\[
|F_n(x) - F_n(y)| \le (3/2)^n |x - y|.
\]

Moreover, the approximating sequence also satisfies
$|F(x) - F_n(x)| \le 1/2^n$. These two estimates together with an
application of the triangle inequality give

\[
|F(x) - F(y)| \le |F_n(x) - F_n(y)| + |F(x) - F_n(x)| + |F(y) - F_n(y)|
\le (3/2)^n |x - y| + \frac{2}{2^n}.
\]

Having fixed $x$ and $y$, we then minimize the right-hand side by choosing
$n$ so that both terms have the same order of magnitude. This is achieved
by taking $n$ so that $3^n |x - y|$ is between 1 and 3. Then, we see that

\[
|F(x) - F(y)| \le c\,2^{-n} = c\,(3^{-n})^\gamma \le M|x - y|^\gamma,
\]

since $3^\gamma = 2$ and $3^{-n}$ is not greater than $|x - y|$. This argument
is repeated in Lemma 2.8 below.

With $E = \mathcal{C}$, $f$ the Cantor-Lebesgue function, and
$\alpha = \gamma = \log 2/\log 3$ (so that $\beta = \alpha/\gamma = 1$), the
two lemmas give

\[
m_1([0, 1]) \le M^\beta m_\alpha(\mathcal{C}).
\]

Thus $m_\alpha(\mathcal{C}) > 0$, and we find that
$\dim \mathcal{C} = \log 2/\log 3$.

The proof of this example is typical in the sense that the inequality
$m_\alpha(\mathcal{C}) < \infty$ is usually easier to obtain than
$0 < m_\alpha(\mathcal{C})$. Also, with some extra effort, it is possible to
show that the $\log 2/\log 3$-dimensional Hausdorff measure of $\mathcal{C}$
is precisely 1. (See Exercise 7.)

Rectifiable curves 

A further example of the role of dimension comes from looking at
continuous curves in $\mathbb{R}^d$. Recall that a continuous curve
$\gamma : [a, b] \to \mathbb{R}^d$ is said to be simple if
$\gamma(t_1) \ne \gamma(t_2)$ whenever $t_1 \ne t_2$, and quasi-simple if the
mapping $t \mapsto \gamma(t)$ is injective for $t$ in the complement of
finitely many points.

Theorem 2.4 Suppose the curve $\gamma$ is continuous and quasi-simple. Then
$\gamma$ is rectifiable if and only if $\Gamma = \{\gamma(t) : a \le t \le b\}$
has strict Hausdorff dimension one. Moreover, in this case the length of the
curve is precisely its one-dimensional measure $m_1(\Gamma)$.

Proof. Suppose to begin with that $\Gamma$ is a rectifiable curve of length
$L$, and consider an arc-length parametrization $\tilde{\gamma}$ such that
$\Gamma = \{\tilde{\gamma}(t) : 0 \le t \le L\}$. This parametrization
satisfies the Lipschitz condition

\[
|\tilde{\gamma}(t_1) - \tilde{\gamma}(t_2)| \le |t_1 - t_2|.
\]

This follows since $|t_1 - t_2|$ is the length of the curve between $t_1$ and
$t_2$, which is at least the distance from $\tilde{\gamma}(t_1)$ to
$\tilde{\gamma}(t_2)$. Since $\tilde{\gamma}$ satisfies the conditions of
Lemma 2.2 with exponent 1 and $M = 1$, we find that

\[
m_1(\Gamma) \le L.
\]

To prove the reverse inequality, we let $a = t_0 < t_1 < \cdots < t_N = b$
denote a partition of $[a, b]$ and let

\[
\Gamma_j = \{\gamma(t) : t_j \le t \le t_{j+1}\},
\]

so that $\Gamma = \bigcup_{j=0}^{N-1} \Gamma_j$, hence

\[
m_1(\Gamma) = \sum_{j=0}^{N-1} m_1(\Gamma_j)
\]

by an application of Property 4 of the Hausdorff measure and the fact
that $\Gamma$ is quasi-simple. Indeed, by removing finitely many points the
union $\bigcup_{j=0}^{N-1} \Gamma_j$ becomes disjoint, while the points
removed clearly have zero $m_1$-measure. We next claim that
$m_1(\Gamma_j) \ge \ell_j$, where $\ell_j$ is the distance from $\gamma(t_j)$
to $\gamma(t_{j+1})$, that is, $\ell_j = |\gamma(t_{j+1}) - \gamma(t_j)|$. To
see this, recall that Hausdorff measure is rotation-invariant, and introduce
new orthogonal coordinates $x$ and $y$ such that
$[\gamma(t_j), \gamma(t_{j+1})]$ is the segment $[0, \ell_j]$ on the $x$-axis.
The projection $\pi(x, y) = x$ satisfies the Lipschitz condition

\[
|\pi(P) - \pi(Q)| \le |P - Q|,
\]

and clearly the segment $[0, \ell_j]$ on the $x$-axis is contained in the
image $\pi(\Gamma_j)$. Therefore, Lemma 2.2 guarantees

\[
\ell_j \le m_1(\Gamma_j),
\]

and thus $m_1(\Gamma) \ge \sum \ell_j$. Since by definition the length $L$ of
$\Gamma$ is the supremum of the sums $\sum \ell_j$ over all partitions of
$[a, b]$, we find that $m_1(\Gamma) \ge L$, as desired.

Conversely, if $\Gamma$ has strict Hausdorff dimension 1, then
$m_1(\Gamma) < \infty$, and the above argument shows that $\Gamma$ is
rectifiable.

The reader may note the resemblance of this characterization of
rectifiability and an earlier one in terms of Minkowski content, given in
Chapter 3. In this connection we point out that there is a different
notion of dimension that is sometimes used instead of Hausdorff dimension.
For a compact set $E$, this dimension is given in terms of the size of
$E^\delta = \{x \in \mathbb{R}^d : d(x, E) < \delta\}$ as $\delta \to 0$. One
observes that if $E$ is a $k$-dimensional cube in $\mathbb{R}^d$, then
$m(E^\delta) \le c\delta^{d-k}$ as $\delta \to 0$, with $m$ the Lebesgue
measure of $\mathbb{R}^d$. With this in mind, the Minkowski dimension of $E$
is defined by

\[
\inf\{\beta : m(E^\delta) = O(\delta^{d - \beta}) \text{ as } \delta \to 0\}.
\]

One can show that the Hausdorff dimension of a set does not exceed its
Minkowski dimension, but that equality does not hold in general. More
details may be found in Exercises 17 and 18.

The Sierpinski triangle 

A Cantor-like set can be constructed in the plane as follows. We begin
with a (solid) closed equilateral triangle $S_0$, whose sides have unit
length. Then, as a first step we remove the shaded open equilateral triangle
pictured in Figure 1.

This leaves three closed triangles whose union we denote by $S_1$. Each
triangle is half the size of the original (or parent) triangle $S_0$, and
these smaller closed triangles are said to be of the first generation: the
triangles in $S_1$ are the children of the parent $S_0$. In the second step,
we repeat the process in each triangle of the first generation. Each such
triangle has three children of the second generation. We denote by $S_2$ the
union of the nine triangles in the second generation. We then repeat this
process to find a sequence $S_k$ of compact sets which satisfy the following
properties:

(a) Each $S_k$ is a union of $3^k$ closed equilateral triangles of side
length $2^{-k}$. (These are the triangles of the $k$th generation.)

(b) $\{S_k\}$ is a decreasing sequence of compact sets; that is,
$S_{k+1} \subset S_k$ for all $k \ge 0$.

The Sierpinski triangle is the compact set defined by

\[
\mathcal{S} = \bigcap_{k=0}^\infty S_k.
\]

Theorem 2.5 The Sierpinski triangle $\mathcal{S}$ has strict Hausdorff
dimension $\alpha = \log 3/\log 2$.

The inequality $m_\alpha(\mathcal{S}) \le 1$ follows immediately from the
construction. Given $\delta > 0$, choose $K$ so that $2^{-K} < \delta$. Since
the set $S_K$ covers $\mathcal{S}$ and consists of $3^K$ triangles each of
diameter $2^{-K} < \delta$, we must have

\[
\mathcal{H}_\alpha^\delta(\mathcal{S}) \le 3^K (2^{-K})^\alpha.
\]

But since $2^\alpha = 3$, we find
$\mathcal{H}_\alpha^\delta(\mathcal{S}) \le 1$, hence
$m_\alpha(\mathcal{S}) \le 1$.

The inequality $m_\alpha(\mathcal{S}) > 0$ is more subtle. For its proof we
need to fix a special point in each triangle that appears in the
construction of $\mathcal{S}$. We choose to call the lower left vertex of a
triangle the vertex of that triangle. With this choice there are $3^k$
vertices of the $k$th generation. The argument that follows is based on the
important fact that all these vertices belong to $\mathcal{S}$.

Suppose $\mathcal{S} \subset \bigcup_{j=1}^\infty F_j$ with
$\operatorname{diam} F_j < \delta$. We wish to prove that

\[
\sum_j (\operatorname{diam} F_j)^\alpha \ge c > 0
\]

for some constant $c$. Clearly, each $F_j$ is contained in a ball of twice
the diameter of $F_j$, so upon replacing $2\delta$ by $\delta$ and noting that
$\mathcal{S}$ is compact, it suffices to show that if
$\mathcal{S} \subset \bigcup_{j=1}^N B_j$, where
$\mathcal{B} = \{B_j\}_{j=1}^N$ is a finite collection of balls whose
diameters are less than $\delta$, then

\[
\sum_{j=1}^N (\operatorname{diam} B_j)^\alpha \ge c > 0.
\]

Suppose we have such a covering by balls. Consider the minimum diameter
of the $B_j$, and choose $k$ so that

\[
2^{-k} \le \min_{1 \le j \le N} \operatorname{diam} B_j < 2^{-k+1}.
\]

Lemma 2.6 Suppose $B$ is a ball in the covering $\mathcal{B}$ that satisfies

\[
2^{-\ell} \le \operatorname{diam} B < 2^{-\ell+1} \quad \text{for some } \ell \le k.
\]

Then $B$ contains at most $c3^{k-\ell}$ vertices of the $k$th generation.

In this chapter, we shall continue to use the common practice of denoting
by $c, c', \ldots$ generic constants whose values are unimportant and may
change from one usage to another. We also use $A \approx B$ to denote that
the quantities $A$ and $B$ are comparable, that is, $cB \le A \le c'B$, for
appropriate constants $c$ and $c'$.

Proof of Lemma 2.6. Let $B^*$ denote the ball with same center as $B$ but
three times its diameter, and let $\Delta_k$ be a triangle of the $k$th
generation whose vertex $v$ lies in $B$. If $\Delta'_\ell$ denotes the
triangle of the $\ell$th generation that contains $\Delta_k$, then since
$\operatorname{diam} B \ge 2^{-\ell}$,

\[
v \in \Delta_k \subset \Delta'_\ell \subset B^*,
\]

as shown in Figure 2.

Next, there is a positive constant $c$ such that $B^*$ can contain at most
$c$ distinct triangles of the $\ell$th generation. This is because triangles
of the $\ell$th generation have disjoint interiors and area equal to
$c'4^{-\ell}$, while $B^*$ has area at most equal to $c''4^{-\ell}$. Finally,
each $\Delta'_\ell$ contains $3^{k-\ell}$ triangles of the $k$th generation,
hence $B$ can contain at most $c3^{k-\ell}$ vertices of triangles of the
$k$th generation.

To complete the proof that
$\sum_{j=1}^N (\operatorname{diam} B_j)^\alpha \ge c > 0$, note that

\[
\sum_{j=1}^N (\operatorname{diam} B_j)^\alpha \ge \sum_\ell N_\ell 2^{-\ell\alpha},
\]

where $N_\ell$ denotes the number of balls in $\mathcal{B}$ that satisfy
$2^{-\ell} \le \operatorname{diam} B_j < 2^{-\ell+1}$. By the lemma, we see
that the total number of vertices of triangles in the $k$th generation that
can be covered by the collection $\mathcal{B}$ can be no more than
$c \sum_\ell N_\ell 3^{k-\ell}$. Since all $3^k$ vertices of triangles in the
$k$th generation belong to $\mathcal{S}$, and all vertices of the $k$th
generation must be covered, we must have
$c \sum_\ell N_\ell 3^{k-\ell} \ge 3^k$. Hence

\[
\sum_\ell N_\ell 3^{-\ell} \ge c.
\]

It now suffices to recall the definition of $\alpha$, which guarantees
$2^{-\ell\alpha} = 3^{-\ell}$, and therefore

\[
\sum_{j=1}^N (\operatorname{diam} B_j)^\alpha \ge c,
\]

as desired.

We give a final example that exhibits properties similar to the Cantor 
set and Sierpinski triangle. It is the curve discovered by von Koch in 1904. 


The von Koch curve 

Consider the unit interval $\mathcal{K}_0 = [0, 1]$, which we may think of as
lying on the $x$-axis in the $xy$-plane. Then consider the polygonal path
$\mathcal{K}_1$ illustrated in Figure 3, which consists of four equal line
segments of length 1/3.





Figure 3. The first few stages in the construction of the von Koch curve 


Let $K_1(t)$, for $0 \le t \le 1$, denote the parametrization of
$\mathcal{K}_1$ that has constant speed. In other words, as $t$ travels from
0 to 1/4, the point $K_1(t)$ travels on the first line segment. As $t$
travels from 1/4 to 1/2, the point $K_1(t)$ travels on the second line
segment, and so on. In particular, we see that $K_1(\ell/4)$ for
$0 \le \ell \le 4$ correspond to the five vertices of $\mathcal{K}_1$.

At the second stage of the construction we repeat the process of
replacing each line segment in stage one by the corresponding polygonal
line. We then obtain the polygonal curve $\mathcal{K}_2$ illustrated in
Figure 3. It has $16 = 4^2$ segments of length $1/9 = 3^{-2}$. We choose a
parametrization $K_2(t)$ ($0 \le t \le 1$) of $\mathcal{K}_2$ that has
constant speed. Observe that $K_2(\ell/4^2)$ for $0 \le \ell \le 4^2$ gives
all vertices of $\mathcal{K}_2$, and that the vertices of $\mathcal{K}_1$
belong to $\mathcal{K}_2$, with

\[
K_2(\ell/4) = K_1(\ell/4) \quad \text{for } 0 \le \ell \le 4.
\]

Repeating this process indefinitely, we obtain a sequence of continuous
polygonal curves $\{\mathcal{K}_j\}$, where $\mathcal{K}_j$ consists of $4^j$
segments of length $3^{-j}$ each. If $K_j(t)$ ($0 \le t \le 1$) is the
parametrization of $\mathcal{K}_j$ that has constant speed, then the vertices
are precisely at the points $K_j(\ell/4^j)$, and

\[
K_{j'}(\ell/4^j) = K_j(\ell/4^j) \quad \text{for } 0 \le \ell \le 4^j
\]

whenever $j' \ge j$.
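The construction just described is easy to carry out on a computer. The following Python sketch builds the vertices $K_j(\ell/4^j)$, $0 \le \ell \le 4^j$, identifying the plane with the complex numbers; that identification, the function name, and the choice to erect each bump on the left of the direction of travel are conventions of the sketch, not of the text.

```python
import cmath
import math


def koch_vertices(j: int) -> list:
    """Vertices K_j(l / 4^j), l = 0..4^j, of the j-th polygonal approximation."""
    pts = [0j, 1 + 0j]                   # K_0 is the unit interval on the x-axis
    rot = cmath.exp(1j * math.pi / 3)    # rotation by pi/3 about the origin
    for _ in range(j):
        new_pts = [pts[0]]
        for a, b in zip(pts, pts[1:]):
            step = (b - a) / 3           # one third of the current segment
            p1 = a + step
            p2 = p1 + step * rot         # apex of the equilateral bump
            p3 = a + 2 * step
            new_pts.extend([p1, p2, p3, b])
        pts = new_pts
    return pts


v2 = koch_vertices(2)
print(len(v2) - 1, abs(v2[1] - v2[0]))   # number of segments and their length
```

Each pass replaces every segment by four segments of one third the length, so stage $j$ has $4^j$ segments of length $3^{-j}$, and the vertices of stage $j$ reappear at every fourth index of stage $j + 1$.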

In the limit as $j$ tends to infinity, the polygonal lines $\mathcal{K}_j$
tend to the von Koch curve $\mathcal{K}$. Indeed, we have

\[
|K_{j+1}(t) - K_j(t)| \le 3^{-j} \quad \text{for all } 0 \le t \le 1 \text{ and } j \ge 0.
\]

This is clear when $j = 0$, and follows by induction in $j$ when we consider
the nature of the construction of the $j$th stage. Since we may write

\[
K_J(t) = K_1(t) + \sum_{j=1}^{J-1} (K_{j+1}(t) - K_j(t)),
\]

the above estimate proves that the series

\[
K_1(t) + \sum_{j=1}^\infty (K_{j+1}(t) - K_j(t))
\]

converges absolutely and uniformly to a continuous function $\mathcal{K}(t)$
that is a parametrization of $\mathcal{K}$. Besides continuity, the function
$\mathcal{K}(t)$ satisfies a regularity condition that takes the form of a
Lipschitz condition, as in the case of the Cantor-Lebesgue function.

Theorem 2.7 The function $\mathcal{K}(t)$ satisfies a Lipschitz condition of
exponent $\gamma = \log 3/\log 4$, that is:

\[
|\mathcal{K}(t) - \mathcal{K}(s)| \le M|t - s|^\gamma \quad \text{for all } t, s \in [0, 1].
\]

We have already observed that $|K_{j+1}(t) - K_j(t)| \le 3^{-j}$. Since $K_j$
travels a distance of $3^{-j}$ in $4^{-j}$ units of time, we see that

\[
|K_j'(t)| \le \left(\frac{4}{3}\right)^j \quad \text{except when } t = \ell/4^j.
\]

Consequently we must have

\[
|K_j(t) - K_j(s)| \le \left(\frac{4}{3}\right)^j |t - s|.
\]

Moreover, $\mathcal{K}(t) = K_1(t) + \sum_{j=1}^\infty (K_{j+1}(t) - K_j(t))$.
We now find ourselves in precisely the same situation as in the proof that
the Cantor-Lebesgue function satisfies a Lipschitz condition with exponent
$\log 2/\log 3$. We generalize that argument in the following lemma.

Lemma 2.8 Suppose $\{f_j\}$ is a sequence of continuous functions on the
interval $[0, 1]$ that satisfy

\[
|f_j(t) - f_j(s)| \le A^j |t - s| \quad \text{for some } A > 1,
\]

and

\[
|f_j(t) - f_{j+1}(t)| \le B^{-j} \quad \text{for some } B > 1.
\]

Then the limit $f(t) = \lim_{j \to \infty} f_j(t)$ exists and satisfies

\[
|f(t) - f(s)| \le M|t - s|^\gamma,
\]

where $\gamma = \log B/\log(AB)$.

Proof. The continuous limit $f$ is given by the uniformly convergent
series

\[
f(t) = f_1(t) + \sum_{k=1}^\infty (f_{k+1}(t) - f_k(t)),
\]

and therefore

\[
|f(t) - f_j(t)| \le \sum_{k=j}^\infty |f_{k+1}(t) - f_k(t)|
\le \sum_{k=j}^\infty B^{-k} \le cB^{-j}.
\]

The triangle inequality, an application of the inequality just obtained,
and the inequality in the statement of the lemma give

\[
|f(t) - f(s)| \le |f_j(t) - f_j(s)| + |(f - f_j)(t)| + |(f - f_j)(s)|
\le c(A^j |t - s| + B^{-j}).
\]

For a fixed pair of numbers $t$ and $s$ with $t \ne s$, we choose $j$ to
minimize the sum $A^j |t - s| + B^{-j}$. This is essentially achieved by
picking $j$ so that the two terms $A^j |t - s|$ and $B^{-j}$ are comparable.
More precisely, we choose a $j$ that satisfies

\[
(AB)^j |t - s| \le 1 \quad \text{and} \quad 1 \le (AB)^{j+1} |t - s|.
\]

Since $|t - s| \le 1$ and $AB > 1$, such a $j$ must exist. The first
inequality then gives

\[
A^j |t - s| \le B^{-j},
\]

while raising the second inequality to the power $\gamma$, and using the
fact that $(AB)^\gamma = B$, gives

\[
1 \le B^{j+1} |t - s|^\gamma.
\]

Thus $B^{-j} \le B|t - s|^\gamma$, and consequently

\[
|f(t) - f(s)| \le c(A^j |t - s| + B^{-j}) \le M|t - s|^\gamma,
\]


as was to be shown. 
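The exponent $\gamma = \log B/\log(AB)$ and the balancing index $j$ used in the proof can be computed directly. The short Python sketch below does so for the two pairs $(A, B)$ that occur in this chapter, namely $(3/2, 2)$ for the Cantor-Lebesgue function and $(4/3, 3)$ for the von Koch parametrization; the function names are ours, and the floor formula for $j$ is one convenient way (an assumption of the sketch) to satisfy the two inequalities.

```python
import math


def holder_exponent(A: float, B: float) -> float:
    """gamma = log B / log(AB), the exponent produced by Lemma 2.8."""
    return math.log(B) / math.log(A * B)


def balancing_index(A: float, B: float, h: float) -> int:
    """An index j with (AB)^j * h <= 1 <= (AB)^(j+1) * h, for 0 < h <= 1."""
    return math.floor(math.log(1 / h) / math.log(A * B))


print(holder_exponent(3 / 2, 2))   # Cantor-Lebesgue function: log 2 / log 3
print(holder_exponent(4 / 3, 3))   # von Koch parametrization: log 3 / log 4

j = balancing_index(4 / 3, 3, 0.01)
print(j, (4 / 3) ** j * 0.01, 3.0 ** -j)   # the two balanced terms
```

With this choice of $j$ the two terms $A^j|t - s|$ and $B^{-j}$ differ by at most the factor $AB$, which is all the proof requires.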


In particular, this result with Lemma 2.2 implies that

\[
\dim \mathcal{K} \le \frac{1}{\gamma} = \frac{\log 4}{\log 3}.
\]

To prove that $m_\alpha(\mathcal{K}) > 0$ for $\alpha = \log 4/\log 3$, and
hence $\dim \mathcal{K} = \log 4/\log 3$, requires an argument similar to the
one given for the Sierpinski triangle. In fact, this argument generalizes to
cover a general family of sets that have a self-similarity property. We
therefore turn our attention to this general theory next.

Remarks. We mention some further facts about the von Koch curve. 
More details can be found in Exercises 13, 14, and 15 below. 


1. The curve $\mathcal{K}$ is one in a family of similarly constructed
curves. For each $\ell$, $1/4 < \ell < 1/2$, consider at the first stage the
curve $K_1^\ell(t)$ given by four line segments each of length $\ell$, the
first and last on the $x$-axis, and the second and third forming two sides
of an isosceles triangle whose base lies on the $x$-axis. (See Figure 4.)
The case $\ell = 1/3$ corresponds to the previously defined von Koch curve.

Proceeding as in the case $\ell = 1/3$, one obtains a curve
$\mathcal{K}^\ell$, and it can be seen that

\[
\dim(\mathcal{K}^\ell) = \frac{\log 4}{\log(1/\ell)}.
\]

Thus for every $\alpha$, $1 < \alpha < 2$, we have a curve of this kind of
dimension $\alpha$. Note that when $\ell \to 1/4$ the limiting curve is a
straight line segment, which has dimension 1. When $\ell \to 1/2$, the limit
can be seen to correspond to a "space-filling" curve.

2. The curves $t \mapsto \mathcal{K}^\ell(t)$, $1/4 < \ell < 1/2$, are each
nowhere differentiable. One can also show that each curve is simple when
$1/4 < \ell < 1/2$.

2.2 Self-similarity 

The Cantor set $\mathcal{C}$, the Sierpinski triangle $\mathcal{S}$, and the
von Koch curve $\mathcal{K}$ all share an important property: each of these
sets contains scaled copies of itself. Moreover, each of these examples was
constructed by iterating a process closely tied to its scaling. For
instance, the interval $[0, 1/3]$ contains a copy of the Cantor set scaled by
a factor of 1/3. The same is true for the interval $[2/3, 1]$, and therefore

\[
\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2,
\]

where $\mathcal{C}_1$ and $\mathcal{C}_2$ are scaled versions of
$\mathcal{C}$. Also, each interval $[0, 1/9]$, $[2/9, 3/9]$, $[6/9, 7/9]$ and
$[8/9, 1]$ contains a copy of $\mathcal{C}$ scaled by a factor of 1/9, and so
on.

In the case of the Sierpinski triangle, each of the three triangles in the
first generation contains a copy of $\mathcal{S}$ scaled by the factor of
1/2. Hence

\[
\mathcal{S} = \mathcal{S}_1 \cup \mathcal{S}_2 \cup \mathcal{S}_3,
\]

where each $\mathcal{S}_j$, $j = 1, 2, 3$, is obtained by scaling and
translating the original Sierpinski triangle. More generally, every triangle
in the $k$th generation contains a copy of $\mathcal{S}$ scaled by the factor
of $1/2^k$.

Finally, each line segment in the initial stage of the construction of the
von Koch curve gives rise to a scaled and possibly rotated copy of the
von Koch curve. In fact

\[
\mathcal{K} = \mathcal{K}_1 \cup \mathcal{K}_2 \cup \mathcal{K}_3 \cup \mathcal{K}_4,
\]

where $\mathcal{K}_j$, $j = 1, 2, 3, 4$, is obtained by scaling $\mathcal{K}$
by the factor of 1/3 and translating and rotating it.

Thus these examples each contain replicas of themselves, but on a
smaller scale. In this section, we give a precise definition of the resulting
notion of self-similarity and prove a theorem determining the Hausdorff
dimension of these sets.

A mapping $S : \mathbb{R}^d \to \mathbb{R}^d$ is said to be a similarity with
ratio $r > 0$ if

\[
|S(x) - S(y)| = r|x - y|.
\]

It can be shown that every similarity of $\mathbb{R}^d$ is the composition of
a translation, a rotation, and a dilation by $r$. (See Problem 3.)

Given finitely many similarities $S_1, \ldots, S_m$ with the same ratio $r$,
we say that the set $F \subset \mathbb{R}^d$ is self-similar if

\[
F = S_1(F) \cup \cdots \cup S_m(F).
\]

We point out the relevance of the various examples we have already seen.
When $F = \mathcal{C}$ is the Cantor set, there are two similarities given by

\[
S_1(x) = x/3 \quad \text{and} \quad S_2(x) = x/3 + 2/3
\]

of ratio 1/3. So $m = 2$ and $r = 1/3$.

In the case of $F = \mathcal{S}$, the Sierpinski triangle, the ratio is
$r = 1/2$ and there are $m = 3$ similarities given by

\[
S_1(x) = \frac{x}{2}, \quad S_2(x) = \frac{x}{2} + \alpha \quad \text{and} \quad S_3(x) = \frac{x}{2} + \beta.
\]

Here, $\alpha$ and $\beta$ are the points drawn in the first diagram in
Figure 5.

If $F = \mathcal{K}$, the von Koch curve, we have

\[
S_1(x) = \frac{x}{3}, \quad S_2(x) = \rho\,\frac{x}{3} + \alpha, \quad
S_3(x) = \rho^{-1}\frac{x}{3} + \beta, \quad \text{and} \quad
S_4(x) = \frac{x}{3} + \gamma,
\]

where $\rho$ is the rotation centered at the origin and of angle $\pi/3$.
There are $m = 4$ similarities which have ratio $r = 1/3$. The points
$\alpha$, $\beta$, and $\gamma$ are shown in the second diagram in Figure 5.

Figure 5. Similarities of the Sierpinski triangle and von Koch curve

Another example, sometimes called the Cantor dust $\mathcal{D}$, is another
two-dimensional version of the standard Cantor set. For each fixed
$0 < \mu < 1/2$, the set $\mathcal{D} = \mathcal{D}_\mu$ may be constructed by
starting with the unit square $Q = [0, 1] \times [0, 1]$. At the first stage
we remove everything but the four closed squares in the corners of $Q$ that
have side length $\mu$. This yields a union $D_1$ of four squares, as
illustrated in Figure 6.

Figure 6. Construction of the Cantor dust

We repeat this process in each sub-square of $D_1$; that is, we remove
everything but the four squares in the corners, each of side length $\mu^2$.
This gives a union $D_2$ of 16 squares. Repeating this process, we obtain
a family $D_1 \supset D_2 \supset \cdots \supset D_k \supset \cdots$ of
compact sets whose intersection defines the Cantor dust corresponding to the
parameter $\mu$.
There are here $m = 4$ similarities of ratio $\mu$ given by

\[
\begin{aligned}
S_1(x) &= \mu x, \\
S_2(x) &= \mu x + (0, 1 - \mu), \\
S_3(x) &= \mu x + (1 - \mu, 1 - \mu), \\
S_4(x) &= \mu x + (1 - \mu, 0).
\end{aligned}
\]

It is to be noted that $\mathcal{D}$ is the product
$\mathcal{C}_\xi \times \mathcal{C}_\xi$, with $\mathcal{C}_\xi$ the Cantor
set of constant dissection $\xi$, as defined in Exercise 3 of Chapter 1.
Here $\xi = 1 - 2\mu$.

The first result we prove guarantees the existence of self-similar sets 
under the assumption that the similarities are contracting, that is, that 
their ratio satisfies r < 1. 

Theorem 2.9 Suppose $S_1, S_2, \ldots, S_m$ are $m$ similarities, each with
the same ratio $r$ that satisfies $0 < r < 1$. Then there exists a unique
non-empty compact set $F$ such that

\[
F = S_1(F) \cup \cdots \cup S_m(F).
\]

The proof of this theorem is in the nature of a fixed point argument.
We shall begin with some large ball $B$ and iteratively apply the mappings
$S_1, \ldots, S_m$. The fact that the similarities have ratio $r < 1$ will
suffice to imply that this process contracts to a unique set $F$ with the
desired property.

Lemma 2.10 There exists a closed ball $B$ so that $S_j(B) \subset B$ for all
$j = 1, \ldots, m$.

Proof. Indeed, we note that if $S$ is a similarity with ratio $r$, then

\[
|S(x)| \le |S(x) - S(0)| + |S(0)| \le r|x| + |S(0)|.
\]

If we require that $|x| \le R$ implies $|S(x)| \le R$, it suffices to choose
$R$ so that $rR + |S(0)| \le R$, that is, $R \ge |S(0)|/(1 - r)$. In this
fashion, we obtain for each $S_j$ a ball $B_j$ centered at the origin that
satisfies $S_j(B_j) \subset B_j$. If $B$ denotes the ball among the $B_j$ with
the largest radius, then the above shows that $S_j(B) \subset B$ for all $j$.
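The radius produced by this proof can be computed explicitly. The Python sketch below evaluates $R = \max_j |S_j(0)|/(1 - r)$ from the images of the origin, and tries it on the two Cantor similarities $S_1(x) = x/3$ and $S_2(x) = x/3 + 2/3$; the function name and the tuple representation of points are choices of the sketch.

```python
import math


def invariant_radius(r: float, images_of_origin) -> float:
    """Radius R = max_j |S_j(0)| / (1 - r) of a ball with S_j(B) inside B for all j."""
    norms = [math.sqrt(sum(c * c for c in p)) for p in images_of_origin]
    return max(norms) / (1.0 - r)


# Cantor similarities: S_1(0) = 0 and S_2(0) = 2/3, common ratio r = 1/3.
R = invariant_radius(1 / 3, [(0.0,), (2 / 3,)])
print(R)
```

Here $R = 1$, so the ball is $[-1, 1]$; it is larger than necessary (the interval $[0, 1]$ is already mapped into itself), but the proof only needs some invariant ball.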

Now for any set $A$, let $\tilde{S}(A)$ denote the set given by

\[
\tilde{S}(A) = S_1(A) \cup \cdots \cup S_m(A).
\]

Note that if $A \subset A'$, then $\tilde{S}(A) \subset \tilde{S}(A')$.

Also observe that while each $S_j$ is a mapping from $\mathbb{R}^d$ to
$\mathbb{R}^d$, the mapping $\tilde{S}$ is not a point mapping, but takes
subsets of $\mathbb{R}^d$ to subsets of $\mathbb{R}^d$.

To exploit the notion of contraction with a ratio less than 1, we
introduce the distance between two compact sets as follows. For each
$\delta > 0$ and set $A$, we let

\[
A^\delta = \{x : d(x, A) < \delta\}.
\]

Hence $A^\delta$ is a set that contains $A$ but which is slightly larger in
terms of $\delta$. If $A$ and $B$ are two compact sets, we define the
Hausdorff distance as

\[
\operatorname{dist}(A, B) = \inf\{\delta : B \subset A^\delta \text{ and } A \subset B^\delta\}.
\]

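Lemma 2.11 The distance function dist defined on compact subsets of
$\mathbb{R}^d$ satisfies

(i) $\operatorname{dist}(A, B) = 0$ if and only if $A = B$.

(ii) $\operatorname{dist}(A, B) = \operatorname{dist}(B, A)$.

(iii) $\operatorname{dist}(A, B) \le \operatorname{dist}(A, C) + \operatorname{dist}(C, B)$.

If $S_1, \ldots, S_m$ are similarities with ratio $r$, then

(iv) $\operatorname{dist}(\tilde{S}(A), \tilde{S}(B)) \le r \operatorname{dist}(A, B)$.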

The proof of the lemma is simple and may be left to the reader. 
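For finite sets the Hausdorff distance reduces to a maximum of nearest-point distances, which makes part (iv) easy to test numerically. The following Python sketch does this on the real line with the two Cantor similarities $x \mapsto x/3$ and $x \mapsto x/3 + 2/3$; the function names and the sample sets are illustrative choices of ours.

```python
def hausdorff_distance(A, B) -> float:
    """dist(A, B) for finite non-empty sets of real numbers."""
    far_a = max(min(abs(a - b) for b in B) for a in A)  # how far A sticks out of B
    far_b = max(min(abs(a - b) for a in A) for b in B)  # how far B sticks out of A
    return max(far_a, far_b)


def apply_cantor_maps(A) -> set:
    """The set S_1(A) union S_2(A), with S_1(x) = x/3 and S_2(x) = x/3 + 2/3."""
    return {x / 3 for x in A} | {x / 3 + 2 / 3 for x in A}


A = {0.0, 1.0}
B = {0.5}
d0 = hausdorff_distance(A, B)
d1 = hausdorff_distance(apply_cantor_maps(A), apply_cantor_maps(B))
print(d0, d1)   # d1 should not exceed d0 / 3
```

In this example the distance shrinks by exactly the ratio $r = 1/3$, so the inequality in (iv) cannot be improved in general.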

Using both lemmas we may now prove Theorem 2.9. We first choose
$B$ as in Lemma 2.10, and let $F_k = \tilde{S}^k(B)$, where $\tilde{S}^k$
denotes the $k$th composition of $\tilde{S}$, that is,
$\tilde{S}^k = \tilde{S}^{k-1} \circ \tilde{S}$ with $\tilde{S}^1 = \tilde{S}$.
Each $F_k$ is compact, non-empty, and $F_k \subset F_{k-1}$, since
$\tilde{S}(B) \subset B$. If we let

\[
F = \bigcap_{k=1}^\infty F_k,
\]

then $F$ is compact, non-empty, and clearly $\tilde{S}(F) = F$, since
applying $\tilde{S}$ to $\bigcap_{k=1}^\infty F_k$ yields
$\bigcap_{k=2}^\infty F_k$, which also equals $F$.

Uniqueness of the set $F$ is proved as follows. Suppose $G$ is another
compact set so that $\tilde{S}(G) = G$. Then, an application of part (iv) in
Lemma 2.11 yields
$\operatorname{dist}(F, G) \le r \operatorname{dist}(F, G)$. Since $r < 1$,
this forces $\operatorname{dist}(F, G) = 0$, so that $F = G$, and the proof of
Theorem 2.9 is complete.
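The iteration $F_k = \tilde{S}^k(B)$ in this proof can be followed exactly in the case of the Cantor similarities. The Python sketch below starts from $B = [0, 1]$, which also satisfies $S_j(B) \subset B$ (a choice of the sketch; the lemma would give $[-1, 1]$), represents each $F_k$ as a sorted list of intervals with exact rational endpoints, and uses hypothetical helper names.

```python
from fractions import Fraction


def apply_maps(intervals):
    """Image of a union of intervals under S_1(x) = x/3 and S_2(x) = x/3 + 2/3."""
    out = set()
    for a, b in intervals:
        out.add((a / 3, b / 3))
        out.add((a / 3 + Fraction(2, 3), b / 3 + Fraction(2, 3)))
    return sorted(out)


def stage(k: int):
    """F_k, the k-fold image of B = [0, 1], as a sorted list of intervals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        intervals = apply_maps(intervals)
    return intervals


for a, b in stage(2):
    print(a, b)   # the four intervals of length 1/9 that make up C_2
```

With this starting set, $F_k$ is exactly the $k$th stage $C_k$ of the middle-thirds construction, so the intersection of the $F_k$ is the Cantor set.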
Under an additional technical condition, one can calculate the precise
Hausdorff dimension of the self-similar set $F$. Loosely speaking, the
restriction holds if the sets $S_1(F), \ldots, S_m(F)$ do not overlap too
much. Indeed, if these sets were disjoint, then we could argue that

\[
m_\alpha(F) = \sum_{j=1}^m m_\alpha(S_j(F)).
\]

Since each $S_j$ scales by $r$, we would then have
$m_\alpha(S_j(F)) = r^\alpha m_\alpha(F)$. Hence

\[
m_\alpha(F) = m r^\alpha m_\alpha(F).
\]

If $m_\alpha(F)$ were finite and positive, then we would have that
$m r^\alpha = 1$; thus

\[
\alpha = \frac{\log m}{\log 1/r}.
\]

The restriction we impose is as follows. We say that the similarities
$S_1, \ldots, S_m$ are separated if there is a bounded open set
$\mathcal{O}$ so that

\[
\mathcal{O} \supset S_1(\mathcal{O}) \cup \cdots \cup S_m(\mathcal{O}),
\]

and the $S_j(\mathcal{O})$ are disjoint. It is not assumed that $\mathcal{O}$
contains $F$.

Theorem 2.12 Suppose $S_1, S_2, \ldots, S_m$ are $m$ separated similarities
with the common ratio $r$ that satisfies $0 < r < 1$. Then the set $F$ has
Hausdorff dimension equal to $\log m/\log(1/r)$.

Observe first that when $F$ is the Cantor set we may take $\mathcal{O}$ to
be the open unit interval, and note that we have already proved that its
dimension is $\log 2/\log 3$. For the Sierpinski triangle the open unit
triangle will do, and $\dim \mathcal{S} = \log 3/\log 2$. In the example of
the Cantor dust the open unit square works, and
$\dim \mathcal{D} = \log m/\log \mu^{-1}$ with $m = 4$. Finally, for the von
Koch curve we may take the interior of the triangle pictured in Figure 7,
and we will have $\dim \mathcal{K} = \log 4/\log 3$.
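The formula of Theorem 2.12 is simple to evaluate. The Python sketch below computes $\log m/\log(1/r)$ for the four examples just listed; the function name is ours, and the value $\mu = 1/4$ for the Cantor dust is merely a sample parameter.

```python
import math


def similarity_dimension(m: int, r: float) -> float:
    """log m / log(1/r) for m separated similarities of common ratio 0 < r < 1."""
    return math.log(m) / math.log(1 / r)


examples = {
    "Cantor set": (2, 1 / 3),
    "Sierpinski triangle": (3, 1 / 2),
    "von Koch curve": (4, 1 / 3),
    "Cantor dust, mu = 1/4": (4, 1 / 4),
}
for name, (m, r) in examples.items():
    alpha = similarity_dimension(m, r)
    print(name, alpha, m * r ** alpha)   # the last column is m * r^alpha
```

In each case the exponent $\alpha$ is the one for which $m r^\alpha = 1$, which is the relation found in the heuristic argument preceding the theorem.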

We now turn to the proof of Theorem 2.12, which will follow the same
approach used in the case of the Sierpinski triangle. If
$\alpha = \log m/\log(1/r)$, we claim that $m_\alpha(F) < \infty$, hence
$\dim F \le \alpha$. Moreover, this inequality holds even without the
separation assumption. Indeed, recall that

\[
F_k = \tilde{S}^k(B),
\]

and $\tilde{S}^k(B)$ is the union of $m^k$ sets of diameter less than $cr^k$
(with $c = \operatorname{diam} B$), each of the form

\[
S_{n_1} \circ \cdots \circ S_{n_k}(B), \quad \text{where } 1 \le n_i \le m \text{ and } 1 \le i \le k.
\]

Figure 7. Open set in the separation of the von Koch similarities

Consequently, if $cr^k \le \delta$, then

\[
\mathcal{H}_\alpha^\delta(F) \le \sum_{n_1, \ldots, n_k} (\operatorname{diam} S_{n_1} \circ \cdots \circ S_{n_k}(B))^\alpha
\le c' m^k r^{\alpha k} \le c',
\]

since $m r^\alpha = 1$, because $\alpha = \log m/\log(1/r)$. Since $c'$ is
independent of $\delta$, we get $m_\alpha(F) \le c'$.

To prove $m_\alpha(F) > 0$, we now use the separation condition. We argue
in parallel with the earlier calculation of the Hausdorff dimension of the
Sierpinski triangle.

Fix a point $x$ in $F$. We define the “vertices” of the $k$-th generation as
the $m^k$ points that lie in $F$ and are given by

$$S_{n_1} \circ \cdots \circ S_{n_k}(x), \quad \text{where } 1 \le n_1 \le m, \dots, 1 \le n_k \le m.$$

Each vertex is labeled by $(n_1, \dots, n_k)$. Vertices need not be distinct, so
they are counted with their multiplicities.

Similarly, we define the “open sets” of the $k$-th generation to be the
$m^k$ sets given by

$$S_{n_1} \circ \cdots \circ S_{n_k}(\mathcal{O}), \quad \text{where } 1 \le n_1 \le m, \dots, 1 \le n_k \le m,$$

and where $\mathcal{O}$ is fixed and chosen to satisfy the separation condition.
Such open sets are again labeled by multi-indices $(n_1, n_2, \dots, n_k)$ with
$1 \le n_j \le m$, $1 \le j \le k$.

Then the open sets of the $k$-th generation are disjoint, since those of
the first generation are disjoint. Moreover, if $k \ge \ell$, each open set of the
$\ell$-th generation contains $m^{k-\ell}$ open sets of the $k$-th generation.

Suppose $v$ is a vertex of the $k$-th generation, and let $\mathcal{O}(v)$ denote the
open set in the $k$-th generation which is associated to $v$, that is, $v$ and
$\mathcal{O}(v)$ carry the same label $(n_1, n_2, \dots, n_k)$. Since $x$ is at a fixed distance
from the original open set $\mathcal{O}$, and $\mathcal{O}$ has a finite diameter, we find that

(a) $d(v, \mathcal{O}(v)) \le c r^k$.

(b) $c' r^k \le \mathrm{diam}\, \mathcal{O}(v) \le c r^k$.

As in the case of the Sierpinski triangle, it suffices to prove that if
$\mathcal{B} = \{B_j\}_{j=1}^N$ is a finite collection of balls whose diameters are less than
$\delta$ and whose union covers $F$, then

$$\sum_{j=1}^{N} (\mathrm{diam}\, B_j)^\alpha \ge c > 0.$$

Suppose we have such a covering by balls, and choose $k$ so that

$$r^k \le \min_{1 \le j \le N} \mathrm{diam}\, B_j < r^{k-1}.$$

Lemma 2.13 Suppose $B$ is a ball in the covering $\mathcal{B}$ that satisfies

$$r^\ell \le \mathrm{diam}\, B < r^{\ell-1} \quad \text{for some } \ell \le k.$$

Then $B$ contains at most $c\, m^{k-\ell}$ vertices of the $k$-th generation.

Proof. If $v$ is a vertex of the $k$-th generation with $v \in B$, and $\mathcal{O}(v)$
denotes the corresponding open set of the $k$-th generation, then, for some
fixed dilate $B^*$ of $B$, properties (a) and (b) above guarantee that $\mathcal{O}(v) \subset B^*$,
and $B^*$ also contains the open set of generation $\ell$ that contains $\mathcal{O}(v)$.

Since $B^*$ has volume $c r^{d\ell}$, and each open set in the $\ell$-th generation has
volume $\approx r^{d\ell}$ (by property (b) above), $B^*$ can contain at most $c$ open
sets of generation $\ell$. Hence $B^*$ contains at most $c\, m^{k-\ell}$ open sets of the
$k$-th generation. Consequently, $B$ can contain at most $c\, m^{k-\ell}$ vertices of
the $k$-th generation, and the lemma is proved.

For the final argument, let $N_\ell$ denote the number of balls in $\mathcal{B}$ so that

$$r^\ell \le \mathrm{diam}\, B_j < r^{\ell-1}.$$

By the lemma, we see that the total number of vertices of the $k$-th
generation that can be covered by the collection $\mathcal{B}$ can be no more than
$c \sum_\ell N_\ell m^{k-\ell}$. Since all $m^k$ vertices of the $k$-th generation belong to $F$,
we must have $c \sum_\ell N_\ell m^{k-\ell} \ge m^k$, and hence

$$\sum_\ell N_\ell m^{-\ell} \ge c.$$

The definition of $\alpha$ gives $r^{\ell\alpha} = m^{-\ell}$, and therefore

$$\sum_{j=1}^{N} (\mathrm{diam}\, B_j)^\alpha \ge \sum_\ell N_\ell r^{\ell\alpha} \ge c,$$

and the proof of Theorem 2.12 is complete.

3 Space-filling curves 

The year 1890 heralded an important discovery: Peano constructed a 
continuous curve that filled an entire square in the plane. Since then, 
many variants of his construction have been given. We shall describe here 
a construction that has the feature of elucidating an additional significant 
fact. It is that from the point of measure theory, speaking broadly, the 
unit interval and unit square are “isomorphic.” 

Theorem 3.1 There exists a curve $t \mapsto \mathcal{P}(t)$ from the unit interval to
the unit square with the following properties:

(i) $\mathcal{P}$ maps $[0, 1]$ to $[0, 1] \times [0, 1]$ continuously and surjectively.

(ii) $\mathcal{P}$ satisfies a Lipschitz condition of exponent $1/2$, that is,

$$|\mathcal{P}(t) - \mathcal{P}(s)| \le M |t - s|^{1/2}.$$

(iii) The image under $\mathcal{P}$ of any sub-interval $[a, b]$ is a compact subset of
the square of (two-dimensional) Lebesgue measure exactly $b - a$.

The third conclusion can be elaborated further. 

Corollary 3.2 There are subsets $Z_1 \subset [0, 1]$ and $Z_2 \subset [0, 1] \times [0, 1]$, each
of measure zero, such that $\mathcal{P}$ is bijective from

$$[0, 1] - Z_1 \quad \text{to} \quad [0, 1] \times [0, 1] - Z_2$$

and measure preserving. In other words, $E$ is measurable if and only if
$\mathcal{P}(E)$ is measurable, and

$$m_1(E) = m_2(\mathcal{P}(E)).$$

Here $m_1$ and $m_2$ denote the Lebesgue measures in $\mathbb{R}^1$ and $\mathbb{R}^2$, respectively.

We shall call the function $t \mapsto \mathcal{P}(t)$ the Peano mapping. Its image
is called the Peano curve.

Several observations help clarify the nature of the conclusions of the
theorem. Suppose that $F : [0, 1] \to [0, 1] \times [0, 1]$ is continuous and
surjective. Then:

(a) $F$ cannot be Lipschitz of exponent $\gamma > 1/2$. This follows at once
from Lemma 2.2, which states that

$$\dim F([0, 1]) \le \frac{1}{\gamma} \dim [0, 1],$$

so that $2 \le 1/\gamma$ as desired.

(b) $F$ cannot be injective. Indeed, if this were the case, then the
inverse $G$ of $F$ would exist and would be continuous. Given any two
points $a \ne b$ in $[0, 1]$, we would get a contradiction by looking at
two distinct curves in the square that join $F(a)$ and $F(b)$, since the
image of these two curves under $G$ would have to intersect at points
between $a$ and $b$. In fact, given any open disc $D$ in the square, there
always exists $x \in D$ so that $F(t) = F(s) = x$ yet $t \ne s$.

The proof of Theorem 3.1 will follow from a careful study of a natural
class of mappings that associate sub-squares in $[0, 1] \times [0, 1]$ to
sub-intervals in $[0, 1]$. This implements the approach implicit in Hilbert’s
iterative procedure, which he set forth in the first three stages in Figure 8.

Figure 8. Construction of the Peano curve

We turn now to the study of the general class of mappings.

3.1 Quartic intervals and dyadic squares

The quartic intervals arise when $[0, 1]$ is successively sub-divided by
powers of 4. For instance, the first generation quartic intervals are the
closed intervals

$$I_1 = [0, 1/4], \quad I_2 = [1/4, 1/2], \quad I_3 = [1/2, 3/4], \quad I_4 = [3/4, 1].$$

The second generation quartic intervals are obtained by sub-dividing each
interval of the first generation by 4. Hence there are $16 = 4^2$ quartic
intervals of the second generation. In general, there are $4^k$ quartic intervals
of the $k$-th generation, each of the form $\left[\frac{\ell}{4^k}, \frac{\ell+1}{4^k}\right]$, where $\ell$ is integral with
$0 \le \ell < 4^k$.

A chain of quartic intervals is a decreasing sequence of intervals

$$I^1 \supset I^2 \supset \cdots \supset I^k \supset \cdots,$$

where $I^k$ is a quartic interval of the $k$-th generation (hence $|I^k| = 4^{-k}$).

Proposition 3.3 Chains of quartic intervals satisfy the following
properties:

(i) If $\{I^k\}$ is a chain of quartic intervals, then there exists a unique
$t \in [0, 1]$ such that $t \in \bigcap_k I^k$.

(ii) Conversely, given $t \in [0, 1]$, there is a chain $\{I^k\}$ of quartic
intervals such that $t \in \bigcap_k I^k$.

(iii) The set of $t$ for which the chain in part (ii) is not unique is a set
of measure zero (in fact, this set is countable).

Proof. Part (i) follows from the fact that $\{I^k\}$ is a decreasing sequence
of compact sets whose diameters go to 0.

For part (ii), we fix $t$ and note that for each $k$ there exists at least one
quartic interval $I^k$ with $t \in I^k$. If $t$ is of the form $\ell/4^k$, where $0 < \ell < 4^k$,
then there are exactly two quartic intervals of the $k$-th generation that
contain $t$. Hence, the set of points for which the chain is not unique is
precisely the set of dyadic rationals

$$\frac{\ell}{4^k}, \quad \text{where } 1 \le k \text{ and } 0 < \ell < 4^k.$$

Note that of course, these fractions are the same as those of the form
$\ell'/2^{k'}$ with $0 < \ell' < 2^{k'}$. This set is countable, hence has measure 0.

It is clear that each chain $\{I^k\}$ of quartic intervals can be represented
naturally by a string $.a_1 a_2 \cdots a_k \cdots$, where each $a_k$ is either 0, 1, 2, or 3.
Then the point $t$ corresponding to this chain is given by

$$t = \sum_{k=1}^{\infty} \frac{a_k}{4^k}.$$

The points where ambiguity occurs are precisely those where $a_k = 3$ for
all sufficiently large $k$, or equivalently where $a_k = 0$ for all sufficiently
large $k$.
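The passage between a point $t$, its base-4 digits, and its chain of quartic intervals can be made concrete. The sketch below uses exact rational arithmetic; the function names are ours. It also shows the ambiguity at $t = 1/4$, which lies in the chain given by the string $.100\cdots$ and in the one given by $.033\cdots$.

```python
from fractions import Fraction

def quartic_digits(t, k):
    """First k base-4 digits a_1..a_k of t in [0, 1).

    This always produces the expansion that does not end in 3s.
    """
    t = Fraction(t)
    digits = []
    for _ in range(k):
        t *= 4
        a = int(t)          # integer part; t is nonnegative here
        digits.append(a)
        t -= a
    return digits

def quartic_chain(digits):
    """Intervals I^1, I^2, ... (as (left, right) pairs) for a digit string."""
    chain = []
    left = Fraction(0)
    for k, a in enumerate(digits, start=1):
        left += Fraction(a, 4 ** k)
        chain.append((left, left + Fraction(1, 4 ** k)))
    return chain

# Two different chains that both contain t = 1/4.
t = Fraction(1, 4)
for digits in ([1, 0, 0], [0, 3, 3]):
    print(digits, all(lo <= t <= hi for lo, hi in quartic_chain(digits)))
```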

Part of our description of the Peano mapping will follow from
associating to each quartic interval a dyadic square. These dyadic squares
are obtained by sub-dividing the unit square $[0, 1] \times [0, 1]$ in the plane by
successively bisecting the sides.

For instance, dyadic squares of the first generation arise from bisecting
the sides of the unit square. This yields four closed squares $S_1, S_2, S_3$
and $S_4$, each of side length $1/2$ and area $|S_i| = 1/4$, for $i = 1, \dots, 4$.

The dyadic squares of the second generation are obtained by bisecting
each dyadic square of the first generation, and so on. In general, there
are $4^k$ squares of the $k$-th generation, each of side length $1/2^k$ and area
$1/4^k$.

A chain of dyadic squares is a decreasing sequence of squares

$$S^1 \supset S^2 \supset \cdots \supset S^k \supset \cdots,$$

where $S^k$ is a dyadic square of the $k$-th generation.

Proposition 3.4 Chains of dyadic squares have the following properties:

(i) If $\{S^k\}$ is a chain of dyadic squares, then there exists a unique
$x \in [0, 1] \times [0, 1]$ such that $x \in \bigcap_k S^k$.

(ii) Conversely, given $x \in [0, 1] \times [0, 1]$, there is a chain $\{S^k\}$ of dyadic
squares such that $x \in \bigcap_k S^k$.

(iii) The set of $x$ for which the chain in part (ii) is not unique is a set
of measure zero.

In this case, the set of ambiguities consists of all points $(x_1, x_2)$ where
one of the coordinates is a dyadic rational. Geometrically, this set is
the (countable) union of vertical and horizontal segments in $[0, 1] \times [0, 1]$
determined by the grid of dyadic rationals. This set has measure zero.

Moreover, each chain of dyadic squares can be represented by a string
$.b_1 b_2 \cdots$, where each $b_k$ is either 0, 1, 2 or 3. Then

$$x = \sum_{k=1}^{\infty} \frac{\bar{b}_k}{2^k}, \tag{1}$$

where

$\bar{b}_k = (0, 0)$ if $b_k = 0$,
$\bar{b}_k = (0, 1)$ if $b_k = 1$,
$\bar{b}_k = (1, 0)$ if $b_k = 2$,
$\bar{b}_k = (1, 1)$ if $b_k = 3$.
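As an illustration of formula (1), the partial sums of the series locate the lower-left corner of each square in the chain. The following sketch (function and table names are ours) computes that corner exactly for a finite digit string $b_1, \dots, b_k$; the corresponding square of the $k$-th generation has this corner and side $2^{-k}$.

```python
from fractions import Fraction

# The vectors bar(b) attached to the digits b = 0, 1, 2, 3 in formula (1).
BAR = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1)}

def square_corner(b_digits):
    """Partial sum of (1): lower-left corner of the square S^k for b_1..b_k."""
    x = y = Fraction(0)
    for k, b in enumerate(b_digits, start=1):
        bx, by = BAR[b]
        x += Fraction(bx, 2 ** k)
        y += Fraction(by, 2 ** k)
    return x, y

print(square_corner([3]))      # corner of the upper-right first-generation square
print(square_corner([1, 2]))   # a second-generation square inside the upper-left one
```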


3.2 Dyadic correspondence 

A dyadic correspondence is a mapping $\Phi$ from quartic intervals to
dyadic squares that satisfies:

(1) $\Phi$ is bijective.

(2) $\Phi$ respects generations.

(3) $\Phi$ respects inclusion.

By (2), we mean that if $I$ is a quartic interval of the $k$-th generation, then
$\Phi(I)$ is a dyadic square of the $k$-th generation. By (3), we mean that if
$I \subset J$, then $\Phi(I) \subset \Phi(J)$.

For example, the trivial, or standard correspondence assigns to the
string $.a_1 a_2 \cdots$ the string $.b_1 b_2 \cdots$ with $b_k = a_k$.

Given a dyadic correspondence $\Phi$, the induced mapping $\Phi^*$ maps
$[0, 1]$ to $[0, 1] \times [0, 1]$ and is given as follows. If $\{t\} = \bigcap I^k$ where $\{I^k\}$
is a chain of quartic intervals, then, since $\{\Phi(I^k)\}$ is a chain of dyadic
squares, we may let

$$\Phi^*(t) = x = \bigcap \Phi(I^k).$$

We note that $\Phi^*$ is well-defined except on a (countable) set of measure
zero (those points $t$ that are represented by more than one quartic chain).

A moment’s reflection will show that if $I'$ is a quartic interval of the
$k$-th generation, then the images $\Phi^*(I') = \{\Phi^*(t),\ t \in I'\}$ comprise the
dyadic square of the $k$-th generation $\Phi(I')$. Thus $\Phi^*(I') = \Phi(I')$, and
hence $m_1(I') = m_2(\Phi^*(I'))$.


Theorem 3.5 Given a dyadic correspondence $\Phi$, there exist sets $Z_1 \subset [0, 1]$
and $Z_2 \subset [0, 1] \times [0, 1]$, each of measure zero, so that:

(i) $\Phi^*$ is a bijection of $[0, 1] - Z_1$ to $[0, 1] \times [0, 1] - Z_2$.

(ii) $E$ is measurable if and only if $\Phi^*(E)$ is measurable.

(iii) $m_1(E) = m_2(\Phi^*(E))$.

Proof. First, let $\mathcal{N}_1$ denote the collection of chains of those quartic
intervals arising in (iii) of Proposition 3.3, those for which the points in
$I = [0, 1]$ are not uniquely representable. Similarly, let $\mathcal{N}_2$ denote the
collection of chains of those dyadic squares for which the corresponding
points in the square $I \times I$ are not uniquely representable.

Since $\Phi$ is a bijection from chains of quartic intervals to chains of dyadic
squares, it is also a bijection from $\mathcal{N}_1 \cup \Phi^{-1}(\mathcal{N}_2)$ to $\Phi(\mathcal{N}_1) \cup \mathcal{N}_2$, and
hence also of their complements. Let $Z_1$ be the subset of $I$ consisting of
all points in $I$ that can be represented (according to (i) of Proposition 3.3)
by the chains in $\mathcal{N}_1 \cup \Phi^{-1}(\mathcal{N}_2)$, and let $Z_2$ be the set of points in the
square that can be represented by dyadic squares in $\Phi(\mathcal{N}_1) \cup \mathcal{N}_2$. Then
$\Phi^*$, the induced mapping, is well-defined on $I - Z_1$, and gives a bijection
of $I - Z_1$ to $(I \times I) - Z_2$. To prove that both $Z_1$ and $Z_2$ have measure
zero, we invoke the following lemma. We suppose $\{f_k\}_{k=1}^{\infty}$ is a fixed given
sequence, with each $f_k$ either 0, 1, 2, or 3.

Lemma 3.6 Let

$$E_0 = \left\{ x = \sum_{k=1}^{\infty} a_k/4^k, \text{ where } a_k \ne f_k \text{ for all sufficiently large } k \right\}.$$

Then $m(E_0) = 0$.

Indeed, if we fix $r$, then $m(\{x : a_r \ne f_r\}) = 3/4$, and

$$m(\{x : a_r \ne f_r \text{ and } a_{r+1} \ne f_{r+1}\}) = (3/4)^2, \text{ etc.}$$

Thus $m(\{x : a_k \ne f_k, \text{ all } k \ge r\}) = 0$, and $E_0$ is a countable union of
such sets, from which the lemma follows.

There is a similar statement for points in the square $S = I \times I$ in terms
of the representation (1).

Note that as a result the set of points in $I$ corresponding to chains in
$\mathcal{N}_1$ form a set of measure zero. In fact, we may use the lemma for the
sequence for which $f_k = 1$, for all $k$, since the elements of $\mathcal{N}_1$ correspond
to sequences $\{a_k\}$ with $a_k = 0$ for all sufficiently large $k$, or $a_k = 3$ for
all sufficiently large $k$.

Similarly, the points in the square $S$ corresponding to $\mathcal{N}_2$ form a set of
measure zero. To see this, take for example $f_k = 1$ for $k$ odd, and $f_k = 2$
for $k$ even, and note that $\mathcal{N}_2$ corresponds to all sequences $\{a_k\}$ where
one of the following four exclusive alternatives holds for all sufficiently
large $k$: either $a_k$ is 0 or 1; or $a_k$ is 2 or 3; or $a_k$ is 0 or 2; or $a_k$ is 1
or 3. By similar reasoning the points $\Phi^{-1}(\mathcal{N}_2)$ and $\Phi(\mathcal{N}_1)$ form sets of
measure zero in $I$ and $I \times I$ respectively.

We now turn to the proof that $\Phi^*$ (which is a bijection from $I - Z_1$
to $(I \times I) - Z_2$) is measure preserving. For this it is useful to recall
Theorem 1.4 in Chapter 1, whereby any open set $\mathcal{O}$ in the unit interval
$I$ can be realized as a countable union $\bigcup_{j=1}^{\infty} I_j$, where each $I_j$ is a closed
interval and the $I_j$ have disjoint interiors. Moreover, an examination of
the proof shows that the intervals can be taken to be dyadic, that is, of the
form $[\ell/2^j, (\ell+1)/2^j]$, for appropriate integers $\ell$ and $j$. Further, such an
interval is itself a quartic interval if $j$ is even, $j = 2k$, or the union of two
quartic intervals $[(2\ell)/2^{2k}, (2\ell+1)/2^{2k}]$ and $[(2\ell+1)/2^{2k}, (2\ell+2)/2^{2k}]$,
if $j$ is odd, $j = 2k - 1$. Thus any open set in $I$ can be given as a union of
quartic intervals whose interiors are disjoint. Similarly, any open set in
the square $I \times I$ is a union of dyadic squares whose interiors are disjoint.
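The splitting of a dyadic interval into quartic intervals described above is simple enough to write out. In the sketch below (the function name and the `(generation, index)` encoding are ours), the pair `(k, i)` stands for the quartic interval $[i/4^k, (i+1)/4^k]$.

```python
from fractions import Fraction

def dyadic_to_quartic(l, j):
    """Write [l/2^j, (l+1)/2^j] as one or two quartic intervals.

    Returns a list of (k, i) pairs, each meaning [i/4^k, (i+1)/4^k].
    """
    if j < 1 or not (0 <= l < 2 ** j):
        raise ValueError("need j >= 1 and 0 <= l < 2^j")
    if j % 2 == 0:
        # j = 2k: the dyadic interval is itself quartic of generation k.
        return [(j // 2, l)]
    # j = 2k - 1: two adjacent quartic intervals of generation k.
    k = (j + 1) // 2
    return [(k, 2 * l), (k, 2 * l + 1)]

def endpoints(k, i):
    """Endpoints of the quartic interval encoded by (k, i)."""
    return Fraction(i, 4 ** k), Fraction(i + 1, 4 ** k)

print(dyadic_to_quartic(1, 1))   # the interval [1/2, 1]
print(dyadic_to_quartic(3, 2))   # the interval [3/4, 1]
```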

Now let $E$ be any set of measure zero in $I - Z_1$ and $\epsilon > 0$. Then we
can cover $E \subset \bigcup_j I_j$, where $I_j$ are quartic intervals and $\sum_j m_1(I_j) < \epsilon$.
Because $\Phi^*(E) \subset \bigcup_j \Phi^*(I_j)$, then

$$m_2(\Phi^*(E)) \le \sum_j m_2(\Phi^*(I_j)) = \sum_j m_1(I_j) < \epsilon.$$

Thus $\Phi^*(E)$ is measurable and $m_2(\Phi^*(E)) = 0$. Similarly, $(\Phi^*)^{-1}$ maps
sets of measure zero in $(I \times I) - Z_2$ to sets of measure zero in $I$.

Now the argument above also shows that if $\mathcal{O}$ is any open set in $I$,
then $\Phi^*(\mathcal{O} - Z_1)$ is measurable, and $m_2(\Phi^*(\mathcal{O} - Z_1)) = m_1(\mathcal{O})$. Thus
this identity goes over to $G_\delta$ sets in $I$. Since any measurable set differs
from a $G_\delta$ set by a set of measure zero, we see that we have established
that $m_2(\Phi^*(E)) = m_1(E)$ for any measurable subset $E$ of $I - Z_1$. The
same argument can be applied to $(\Phi^*)^{-1}$, and this completes the proof
of the theorem.

The Peano mapping will be obtained as $\Phi^*$ for a special correspondence $\Phi$.

3.3 Construction of the Peano mapping 

The particular dyadic correspondence we now present provides us with
the steps to follow when tracing the approximations of the Peano curve.
The main idea behind its construction is that as we go from one quartic
interval in the $k$-th generation to the next quartic interval in the same
generation, we move from a dyadic square of the $k$-th generation to another
square of the $k$-th generation that shares a common side.

More precisely, we say that two quartic intervals in the same generation
are adjacent if they share a point in common. Also, two squares in the
same generation are adjacent if they share a side in common.

Lemma 3.7 There is a unique dyadic correspondence $\Phi$ so that:

(i) If $I$ and $J$ are two adjacent intervals of the same generation, then
$\Phi(I)$ and $\Phi(J)$ are two adjacent squares (of the same generation).

(ii) In generation $k$, if $I_-$ is the left-most interval and $I_+$ the right-most
interval, then $\Phi(I_-)$ is the left-lower square and $\Phi(I_+)$ is the
right-lower square.

Part (ii) of the lemma is illustrated in Figure 9. 



Figure 9. Special dyadic correspondence 


Given a square $S$ and its four immediate sub-squares, an acceptable
traverse is an ordering of the sub-squares $S_1, S_2, S_3$, and $S_4$, so that
$S_j$ and $S_{j+1}$ are adjacent for $j = 1, 2, 3$. With such an ordering, we note
that if we color $S_1$ white, and then alternate black and white, the square
$S_3$ is also white, while $S_2$ and $S_4$ are black. The important point to
remember is that if the first square in a traverse is white, then the last
square is black.

The key observation is the following. Suppose we are given a square
$S$, and a side $\sigma$ of $S$. If $S_1$ is any of the immediate four sub-squares in
$S$, then there exists a unique traverse $S_1, S_2, S_3$, and $S_4$ so that the last
square $S_4$ has a side in common with $\sigma$. With the initial square $S_1$ in
the lower-left corner of $S$, the four possibilities, which correspond to the
four choices of $\sigma$, are illustrated in Figure 10.

We may now begin the inductive description of the dyadic correspondence
satisfying the conditions in the lemma. On quartic intervals of the
first generation we assign the square $S_j = \Phi(I_j)$, as pictured in Figure 11.

[Figure 10: the four traverses starting from the lower-left sub-square $S_1$, one for each choice of the side $\sigma$.]

[Figure 11: the unit square divided into four sub-squares, with $S_2$ and $S_3$ in the top row and $S_1$ and $S_4$ in the bottom row.]

Figure 11. Initial step of the correspondence 


Now suppose $\Phi$ has been defined for all quartic intervals of generation
less than or equal to $k$. We now write the intervals in generation $k$ in
increasing order as $I_1, \dots, I_{4^k}$, and let $S_j = \Phi(I_j)$. We then divide $I_1$
into four quartic intervals of generation $k + 1$ and denote them by $I_{1,1}$,
$I_{1,2}$, $I_{1,3}$, and $I_{1,4}$, where the intervals are chosen in increasing order.

Then, we assign to each interval $I_{1,j}$ a dyadic square $\Phi(I_{1,j}) = S_{1,j}$ of
generation $k + 1$ contained in $S_1$ so that:

(a) $S_{1,1}$ is the lower-left sub-square of $S_1$,

(b) $S_{1,4}$ touches the side that $S_1$ shares with $S_2$,

(c) $S_{1,1}$, $S_{1,2}$, $S_{1,3}$, and $S_{1,4}$ is a traverse.

This is possible, since the induction hypothesis guarantees that $S_2$ is
adjacent to $S_1$.

This settles the assignments for the sub-squares of $S_1$, so we now turn
our attention to $S_2$. Let $I_{2,1}$, $I_{2,2}$, $I_{2,3}$, and $I_{2,4}$ denote the quartic
intervals of generation $k + 1$ in $I_2$, written in increasing order. First, we
take $S_{2,1} = \Phi(I_{2,1})$ to be the sub-square of $S_2$ which is adjacent to $S_{1,4}$.
This can be done because $S_{1,4}$ touches $S_2$ by construction. Note that
we leave $S_1$ from a black square ($S_{1,4}$), and enter $S_2$ in a white square
($S_{2,1}$). Since $S_3$ is adjacent to $S_2$, we may now find a traverse $S_{2,1}$, $S_{2,2}$,
$S_{2,3}$ and $S_{2,4}$ so that $S_{2,4}$ touches $S_3$.

We may then repeat this process in each interval $I_j$ and square $S_j$,
$j = 3, \dots, 4^k$. Note that at each stage the square $S_{j,1}$ (the “entering”
square) is white, while $S_{j,4}$ (the “exiting” square) is black.

In the final step, the induction hypothesis guarantees that $S_{4^k}$ is the
lower-right corner square. Moreover, since $S_{4^k - 1}$ must be adjacent to
$S_{4^k}$ it must be either above it, or to the left of it, so we enter a square of
the $(k+1)$-st generation along an upper or left side. The entering square
is a white square, and we traverse to the lower-right corner sub-square
of $S_{4^k}$, which is a black square.

This concludes the inductive step, hence the proof of Lemma 3.7. 
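The conditions of Lemma 3.7 can be tested on a computer. The sketch below uses the widely known bit-manipulation routine for the Hilbert curve ordering of a $2^k \times 2^k$ grid; this routine is our substitute for the inductive traverse construction above, not the text's own procedure, and the function names are ours. The checks confirm that the ordering is bijective, respects inclusion, sends consecutive intervals to squares sharing a side, and starts and ends in the lower-left and lower-right squares. By the uniqueness part of the lemma it then agrees with $\Phi$.

```python
def hilbert_cell(k, d):
    """Grid cell (x, y), 0 <= x, y < 2^k, assigned to the d-th quartic
    interval of generation k (d counted from 0, left to right)."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << k):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # reflect the s-by-s block across its anti-diagonal ...
                x, y = s - 1 - x, s - 1 - y
            # ... combined with the swap below; swap alone is the diagonal flip
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def check_generation(k):
    """Check the conditions of Lemma 3.7 in generation k >= 1."""
    n = 1 << k
    cells = [hilbert_cell(k, d) for d in range(4 ** k)]
    bijective = len(set(cells)) == 4 ** k
    adjacent = all(abs(x1 - x0) + abs(y1 - y0) == 1
                   for (x0, y0), (x1, y1) in zip(cells, cells[1:]))
    corners = cells[0] == (0, 0) and cells[-1] == (n - 1, 0)
    # inclusion: the parent square of cell d is the cell of interval d // 4
    nested = all((x // 2, y // 2) == hilbert_cell(k - 1, d // 4)
                 for d, (x, y) in enumerate(cells))
    return bijective and adjacent and corners and nested

print([hilbert_cell(1, d) for d in range(4)])
print(all(check_generation(k) for k in range(1, 6)))
```

The first line printed lists the first-generation squares in the order lower-left, upper-left, upper-right, lower-right, which matches the layout of Figure 11.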

We may now begin the actual description of the Peano curve. For each
generation $k$ we construct a polygonal line which consists of vertical and
horizontal line segments connecting the centers of consecutive squares.
More precisely, let $\Phi$ denote the dyadic correspondence in Lemma 3.7,
and let $S_1, \dots, S_{4^k}$ be the squares of the $k$-th generation ordered according
to $\Phi$, that is, $\Phi(I_j) = S_j$. Let $t_j$ denote the middle point of $I_j$,

$$t_j = \frac{j - \frac{1}{2}}{4^k} \quad \text{for } j = 1, \dots, 4^k.$$

Let $x_j$ be the center of the square $S_j$, and define

$$\mathcal{P}_k(t_j) = x_j.$$

Also set

$$\mathcal{P}_k(0) = (0, 1/2^{k+1}) = x_0 \quad \text{where } t_0 = 0,$$

and

$$\mathcal{P}_k(1) = (1, 1/2^{k+1}) = x_{4^k+1} \quad \text{where } t_{4^k+1} = 1.$$

Then, we extend $\mathcal{P}_k(t)$ to the unit interval $0 \le t \le 1$ by linearity along
the sub-intervals determined by the division points $t_0, \dots, t_{4^k+1}$.

Note that the distance $|x_j - x_{j+1}| = 1/2^k$, while $|t_j - t_{j+1}| = 1/4^k$ for
$0 < j < 4^k$. Also

$$|x_1 - x_0| = |x_{4^k} - x_{4^k+1}| = \frac{1}{2 \cdot 2^k},$$

while

$$|t_1 - t_0| = |t_{4^k} - t_{4^k+1}| = \frac{1}{2 \cdot 4^k}.$$

Therefore $|\mathcal{P}_k'(t)| = 4^k 2^{-k} = 2^k$ except when $t = t_j$.

As a result,

$$|\mathcal{P}_k(t) - \mathcal{P}_k(s)| \le 2^k |t - s|.$$

However,

$$|\mathcal{P}_{k+1}(t) - \mathcal{P}_k(t)| \le \sqrt{2}\, 2^{-k},$$

because when $\ell/4^k \le t \le (\ell+1)/4^k$, then $\mathcal{P}_{k+1}(t)$ and $\mathcal{P}_k(t)$ belong to
the same dyadic square of generation $k$.
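The two estimates above can be checked numerically on the polygonal approximations. The sketch below is only an illustration: it orders the squares with the standard Hilbert-curve index routine (assumed here to realize the correspondence of Lemma 3.7, since it has the adjacency and corner properties stated there), and the names `hilbert_cell` and `peano_approx` are ours.

```python
import math

def hilbert_cell(k, d):
    """Cell (x, y) of the 2^k x 2^k grid visited at step d (from 0)."""
    x = y = 0
    t, s = d, 1
    while s < (1 << k):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def peano_approx(k, t):
    """The polygonal map P_k at a point t of [0, 1]."""
    n = 4 ** k
    side = 2.0 ** -k

    def node(j):
        # division point t_j together with the point x_j = P_k(t_j)
        if j == 0:
            return 0.0, (0.0, side / 2)
        if j == n + 1:
            return 1.0, (1.0, side / 2)
        cx, cy = hilbert_cell(k, j - 1)
        return (j - 0.5) / n, ((cx + 0.5) * side, (cy + 0.5) * side)

    j = min(n, int(t * n + 0.5))          # t_j <= t <= t_{j+1}
    (ta, pa), (tb, pb) = node(j), node(j + 1)
    w = (t - ta) / (tb - ta)
    return pa[0] + w * (pb[0] - pa[0]), pa[1] + w * (pb[1] - pa[1])

samples = [i / 1000 for i in range(1001)]
for k in range(1, 5):
    gap = max(math.dist(peano_approx(k + 1, t), peano_approx(k, t))
              for t in samples)
    print(k, gap, math.sqrt(2) * 2.0 ** -k)   # observed gap versus the bound
```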

Therefore the limit

$$\mathcal{P}(t) = \lim_{k \to \infty} \mathcal{P}_k(t) = \mathcal{P}_1(t) + \sum_{j=1}^{\infty} \left( \mathcal{P}_{j+1}(t) - \mathcal{P}_j(t) \right)$$

exists, and defines a continuous function in view of the uniform
convergence. By Lemma 2.8 we conclude that

$$|\mathcal{P}(t) - \mathcal{P}(s)| \le M |t - s|^{1/2},$$

and $\mathcal{P}$ satisfies a Lipschitz condition of exponent $1/2$.

Moreover, each $\mathcal{P}_k(t)$ visits each dyadic square of generation $k$ as $t$
ranges in $[0, 1]$. Hence the image of $\mathcal{P}$ is dense in the unit square, and by
continuity we find that $t \mapsto \mathcal{P}(t)$ is a surjection.

Finally, to prove the measure preserving property of $\mathcal{P}$, it suffices to
establish $\mathcal{P} = \Phi^*$.

Lemma 3.8 If $\Phi$ is the dyadic correspondence in Lemma 3.7, then $\Phi^*(t) =
\mathcal{P}(t)$ for every $0 \le t \le 1$.

Proof. First, we observe that $\Phi^*(t)$ is unambiguously defined for
every $t$. Indeed, suppose $t \in \bigcap_k I^k$ and $t \in \bigcap_k J^k$ are two chains of
quartic intervals; then $I^k$ and $J^k$ must be adjacent for sufficiently large
$k$. Thus $\Phi(I^k)$ and $\Phi(J^k)$ must be adjacent squares for all sufficiently
large $k$. Hence

$$\bigcap_k \Phi(I^k) = \bigcap_k \Phi(J^k).$$

Next, directly from our construction we have

$$\bigcap_k \Phi(I^k) = \lim_{k \to \infty} \mathcal{P}_k(t) = \mathcal{P}(t).$$

This gives the desired conclusion.

The argument also shows that $\mathcal{P}(I) = \Phi(I)$ for any quartic interval $I$.
Now recall that any interval $(a, b)$ can be written as $\bigcup_j I_j$, where the $I_j$
are quartic intervals with disjoint interiors. Because $\mathcal{P}(I_j) = \Phi(I_j)$, these
are then dyadic squares with disjoint interiors. Since $\mathcal{P}(a, b) = \bigcup_{j=1}^{\infty} \mathcal{P}(I_j)$,
we have

$$m_2(\mathcal{P}(a, b)) = \sum_{j=1}^{\infty} m_2(\mathcal{P}(I_j)) = \sum_{j=1}^{\infty} m_2(\Phi(I_j)) = \sum_{j=1}^{\infty} m_1(I_j) = m_1(a, b).$$

This proves conclusion (iii) of Theorem 3.1. The other conclusions having
already been established, we need only note that the corollary is
contained in Theorem 3.5.

As a result, we conclude that $t \mapsto \mathcal{P}(t)$ also induces a measure
preserving mapping from $[0, 1]$ to $[0, 1] \times [0, 1]$. This concludes the proof of
Theorem 3.1.

4* Besicovitch sets and regularity 

We begin by presenting a surprising regularity property enjoyed by all
measurable subsets (of finite measure) of $\mathbb{R}^d$ when $d \ge 3$. As we shall
see, the fact that the corresponding phenomenon does not hold for $d = 2$
is due to the existence of a remarkable set that was discovered by
Besicovitch. A construction of a set of this kind will be detailed in
Section 4.4.

We first fix some notation. For each unit vector $\gamma$ on the sphere,
$\gamma \in S^{d-1}$, and each $t \in \mathbb{R}$ we consider the plane $\mathcal{P}_{t,\gamma}$, which is defined
as the $(d-1)$-dimensional affine hyperplane perpendicular to $\gamma$ and of
“signed distance” $t$ from the origin (see the footnote below). The plane $\mathcal{P}_{t,\gamma}$ is given by

$$\mathcal{P}_{t,\gamma} = \{x \in \mathbb{R}^d : x \cdot \gamma = t\}.$$

Footnote: Note that there are two planes perpendicular to $\gamma$ and of distance $|t|$ from the origin;
this accounts for the fact that $t$ may be either positive or negative.

We observe that each $\mathcal{P}_{t,\gamma}$ carries a natural $(d-1)$ Lebesgue measure,
denoted by $m_{d-1}$. In fact, if we complete $\gamma$ to an orthonormal basis
$e_1, e_2, \dots, e_{d-1}, \gamma$ of $\mathbb{R}^d$, then we can write any $x \in \mathbb{R}^d$ in terms of the
corresponding coordinates as $x = x_1 e_1 + x_2 e_2 + \cdots + x_d \gamma$. When we set
$x \in \mathbb{R}^d = \mathbb{R}^{d-1} \times \mathbb{R}$ with $(x_1, \dots, x_{d-1}) \in \mathbb{R}^{d-1}$, $x_d \in \mathbb{R}$, then the
measure $m_{d-1}$ on $\mathcal{P}_{t,\gamma}$ is the Lebesgue measure on $\mathbb{R}^{d-1}$. This definition of
$m_{d-1}$ is independent of the choice of orthonormal vectors $e_1, e_2, \dots, e_{d-1}$,
since Lebesgue measure is invariant under rotations. (See Problem 4,
Chapter 2, or Exercise 26, Chapter 3.)

With these preliminaries out of the way, we define for each subset
$E \subset \mathbb{R}^d$ the slice of $E$ cut out by the plane $\mathcal{P}_{t,\gamma}$ as

$$E_{t,\gamma} = E \cap \mathcal{P}_{t,\gamma}.$$

We now consider the slices $E_{t,\gamma}$ as $t$ varies, where $E$ is measurable and
$\gamma$ is fixed. (See Figure 12.)



We observe that for almost every $t$ the set $E_{t,\gamma}$ is $m_{d-1}$ measurable
and, moreover, $m_{d-1}(E_{t,\gamma})$ is a measurable function of $t$. This is a
direct consequence of Fubini’s theorem and the above decomposition,
$\mathbb{R}^d = \mathbb{R}^{d-1} \times \mathbb{R}$. In fact, so long as the direction $\gamma$ is pre-assigned, not
much more can be said in general about the function $t \mapsto m_{d-1}(E_{t,\gamma})$.
However, when $d \ge 3$ the nature of the function is dramatically different
for “most” $\gamma$. This is contained in the following theorem.

Theorem 4.1 Suppose $E$ is of finite measure in $\mathbb{R}^d$ with $d \ge 3$. Then
for almost every $\gamma \in S^{d-1}$:

(i) $E_{t,\gamma}$ is measurable for all $t \in \mathbb{R}$.

(ii) $m_{d-1}(E_{t,\gamma})$ is continuous in $t \in \mathbb{R}$.

Moreover, the function of $t$ defined by $\mu(t, \gamma) = m_{d-1}(E_{t,\gamma})$ satisfies a
Lipschitz condition with exponent $\alpha$ for any $\alpha$ with $0 < \alpha < 1/2$.

The almost everywhere assertion is with respect to the natural measure
$d\sigma$ on $S^{d-1}$ that arises in the polar coordinate formula in Section 3.2 of
the previous chapter.

We recall that a function $f$ is Lipschitz with exponent $\alpha$ if
$|f(t_1) - f(t_2)| \le A |t_1 - t_2|^\alpha$ for some $A$.

A significant part of (i) is that for a.e. $\gamma$, the slice $E_{t,\gamma}$ is measurable
for all values of the parameter $t$. In particular, one has the following.

Corollary 4.2 Suppose $E$ is a set of measure zero in $\mathbb{R}^d$ with $d \ge 3$.
Then, for almost every $\gamma \in S^{d-1}$, the slice $E_{t,\gamma}$ has zero measure for all
$t \in \mathbb{R}$.

The fact that there is no analogue of this when $d = 2$ is a consequence of
the existence of a Besicovitch set (also called a “Kakeya set”), which is
defined as a set that satisfies the three conditions in the theorem below.

Theorem 4.3 There exists a set $\mathcal{B}$ in $\mathbb{R}^2$ that:

(i) is compact,

(ii) has Lebesgue measure zero,

(iii) contains a translate of every unit line segment.

Note that with $F = \mathcal{B}$ and $\gamma \in S^1$ one has $m_1(F \cap \mathcal{P}_{t_0,\gamma}) \ge 1$ for some $t_0$.
If $m_1(F \cap \mathcal{P}_{t,\gamma})$ were continuous in $t$, then this measure would be strictly
positive for an interval in $t$ containing $t_0$, and thus we would have
$m_2(F) > 0$, by Fubini’s theorem. This contradiction shows that the
analogue of Theorem 4.1 cannot hold for $d = 2$.

While the set $\mathcal{B}$ has zero two-dimensional measure, this assertion cannot
be improved by replacing this measure by $\alpha$-dimensional Hausdorff
measure, with $\alpha < 2$.

Theorem 4.4 Suppose $F$ is any set that satisfies the conclusions (i)
and (iii) of Theorem 4.3. Then $F$ has Hausdorff dimension 2.

4.1 The Radon transform

Theorems 4.1 and 4.4 will be derived by an analysis of the regularity
properties of the Radon transform $\mathcal{R}$. The operator $\mathcal{R}$ arises in a number
of problems in analysis, and was already considered in Chapter 6 of
Book I.

For an appropriate function $f$ on $\mathbb{R}^d$, the Radon transform of $f$ is
defined by

$$\mathcal{R}(f)(t, \gamma) = \int_{\mathcal{P}_{t,\gamma}} f.$$

The integration is performed over the plane $\mathcal{P}_{t,\gamma}$ with respect to the
measure $m_{d-1}$ discussed above. We first make the following simple
observations:

1. If $f$ is continuous and has compact support, then $f$ is of course
integrable on every plane $\mathcal{P}_{t,\gamma}$, and so $\mathcal{R}(f)(t, \gamma)$ is defined for all
$(t, \gamma) \in \mathbb{R} \times S^{d-1}$. Moreover it is a continuous function of the pair
$(t, \gamma)$ and has compact support in the $t$-variable.

2. If $f$ is merely Lebesgue integrable, then $f$ may fail to be measurable
or integrable on $\mathcal{P}_{t,\gamma}$ for some $(t, \gamma)$, and thus $\mathcal{R}(f)(t, \gamma)$ is not
defined for those $(t, \gamma)$.

3. Suppose $f$ is the characteristic function of the set $E$, that is, $f = \chi_E$.
Then $\mathcal{R}(f)(t, \gamma) = m_{d-1}(E_{t,\gamma})$ if $E_{t,\gamma}$ is measurable.

It is this last property that links the Radon transform to our problem.
Key estimates in this conclusion involve a maximal “Radon transform”
defined by

$$\mathcal{R}^*(f)(\gamma) = \sup_{t \in \mathbb{R}} |\mathcal{R}(f)(t, \gamma)|,$$

as well as corresponding expressions controlling the Lipschitz character
of $\mathcal{R}(f)(t, \gamma)$ as a function of $t$. A basic fact inherent in our analysis
is that the regularity of the Radon transform actually improves as the
dimension of the underlying space increases.

Theorem 4.5 Suppose $f$ is continuous and has compact support in $\mathbb{R}^d$
with $d \ge 3$. Then

$$\int_{S^{d-1}} \mathcal{R}^*(f)(\gamma)\, d\sigma(\gamma) \le c \left( \|f\|_{L^1(\mathbb{R}^d)} + \|f\|_{L^2(\mathbb{R}^d)} \right) \tag{2}$$

for some constant $c > 0$ that does not depend on $f$.

An inequality of this type is a typical “a priori” estimate. It is obtained
first under some regularity assumption on the function $f$, and then a
limiting argument allows one to pass to the more general case when $f$
belongs to $L^1 \cap L^2$.

We make some comments about the appearance of both the $L^1$-norm
and $L^2$-norm in (2). The $L^2$-norm imposes a crucial local control of
the kind that is necessary for the desired regularity. (See Exercise 27.)
However, without some restriction on $f$ of a global nature, the function
$f$ might fail to be integrable on any plane $\mathcal{P}_{t,\gamma}$, as the example $f(x) =
1/(1 + |x|^{d-1})$ shows. Note that this function belongs to $L^2(\mathbb{R}^d)$ if $d \ge 3$,
but not to $L^1(\mathbb{R}^d)$.

The proof of Theorem 4.5 actually gives an essentially stronger result,
which we state as a corollary.

Corollary 4.6 Suppose $f$ is continuous and has compact support in $\mathbb{R}^d$,
$d \ge 3$. Then for any $\alpha$, $0 < \alpha < 1/2$, the inequality (2) holds with
$\mathcal{R}^*(f)(\gamma)$ replaced by

$$\sup_{t_1 \ne t_2} \frac{|\mathcal{R}(f)(t_1, \gamma) - \mathcal{R}(f)(t_2, \gamma)|}{|t_1 - t_2|^\alpha}. \tag{3}$$

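The proof of the theorem relies on the interplay between the Radon
transform and the Fourier transform.

For fixed $\gamma \in S^{d-1}$ we let $\hat{\mathcal{R}}(f)(\lambda, \gamma)$ denote the Fourier transform of
$\mathcal{R}(f)(t, \gamma)$ in the $t$-variable,

$$\hat{\mathcal{R}}(f)(\lambda, \gamma) = \int_{-\infty}^{\infty} \mathcal{R}(f)(t, \gamma) e^{-2\pi i \lambda t}\, dt.$$

In particular, we use $\lambda \in \mathbb{R}$ to denote the dual variable of $t$.

We also write $\hat{f}$ for the Fourier transform of $f$ as a function on $\mathbb{R}^d$,
namely

$$\hat{f}(\xi) = \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot \xi}\, dx.$$

Lemma 4.7 If $f$ is continuous with compact support, then for every
$\gamma \in S^{d-1}$ we have

$$\hat{\mathcal{R}}(f)(\lambda, \gamma) = \hat{f}(\lambda\gamma).$$

The right-hand side is just the Fourier transform of $f$ evaluated at the
point $\lambda\gamma$.

Proof. For each unit vector $\gamma$ we use the adapted coordinate system
described above: $x = (x_1, \dots, x_d)$ where $\gamma$ coincides with the $x_d$
direction. We can then write each $x \in \mathbb{R}^d$ as $x = (u, t)$ with $u \in \mathbb{R}^{d-1}$, $t \in \mathbb{R}$,
where $x \cdot \gamma = t = x_d$ and $u = (x_1, \dots, x_{d-1})$. Moreover

$$\int_{\mathcal{P}_{t,\gamma}} f = \int_{\mathbb{R}^{d-1}} f(u, t)\, du,$$

and Fubini’s theorem shows that $\int_{\mathbb{R}^d} f(x)\, dx = \int_{-\infty}^{\infty} \left( \int_{\mathcal{P}_{t,\gamma}} f \right) dt$.
Applying this to $f(x) e^{-2\pi i x \cdot (\lambda\gamma)}$ in place of $f(x)$ gives

$$\hat{f}(\lambda\gamma) = \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot (\lambda\gamma)}\, dx = \int_{-\infty}^{\infty} \left( \int_{\mathbb{R}^{d-1}} f(u, t)\, du \right) e^{-2\pi i \lambda t}\, dt = \int_{-\infty}^{\infty} \left( \int_{\mathcal{P}_{t,\gamma}} f \right) e^{-2\pi i \lambda t}\, dt.$$

Therefore $\hat{f}(\lambda\gamma) = \hat{\mathcal{R}}(f)(\lambda, \gamma)$, and the lemma is proved.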
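The identity of Lemma 4.7 can be tested numerically in the plane ($d = 2$). The sketch below uses a shifted Gaussian as the test function; this is our choice, and it is an assumption beyond the lemma as stated, since a Gaussian is not compactly supported (it decays fast enough that the same computation applies). The Radon transform and its Fourier transform in $t$ are approximated by Riemann sums, and the result is compared with the known closed form of $\hat{f}$. All names and the truncation parameters `L` and `h` are illustrative.

```python
import cmath
import math

C = (0.3, -0.2)   # centre of the test Gaussian (arbitrary choice)

def f(x, y):
    """Test function f(x) = exp(-pi |x - C|^2) on R^2."""
    return math.exp(-math.pi * ((x - C[0]) ** 2 + (y - C[1]) ** 2))

def radon(t, theta, L=6.0, h=0.05):
    """Riemann sum for R(f)(t, gamma), gamma = (cos theta, sin theta):
    the integral of f over the line {x : x . gamma = t}."""
    gx, gy = math.cos(theta), math.sin(theta)
    n = int(round(L / h))
    total = 0.0
    for i in range(-n, n + 1):
        u = i * h
        # point t*gamma + u*gamma_perp, with gamma_perp = (-gy, gx)
        total += f(t * gx - u * gy, t * gy + u * gx)
    return total * h

def radon_hat(lam, theta, L=6.0, h=0.05):
    """Riemann sum for the Fourier transform of t -> R(f)(t, gamma) at lam."""
    n = int(round(L / h))
    total = 0j
    for i in range(-n, n + 1):
        t = i * h
        total += radon(t, theta, L, h) * cmath.exp(-2j * math.pi * lam * t)
    return total * h

def f_hat(xi1, xi2):
    """Closed form of the Fourier transform of the shifted Gaussian."""
    phase = cmath.exp(-2j * math.pi * (xi1 * C[0] + xi2 * C[1]))
    return phase * math.exp(-math.pi * (xi1 ** 2 + xi2 ** 2))

lam, theta = 0.7, 1.1
lhs = radon_hat(lam, theta)
rhs = f_hat(lam * math.cos(theta), lam * math.sin(theta))
print(abs(lhs - rhs))   # should be very small
```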


Lemma 4.8 If $f$ is continuous with compact support, then

$$\int_{S^{d-1}} \left( \int_{-\infty}^{\infty} |\hat{\mathcal{R}}(f)(\lambda, \gamma)|^2 |\lambda|^{d-1}\, d\lambda \right) d\sigma(\gamma) = 2 \int_{\mathbb{R}^d} |f(x)|^2\, dx.$$

Let us observe the crucial point that the greater the dimension $d$, the
larger the factor $|\lambda|^{d-1}$ as $|\lambda|$ tends to infinity. Hence the greater the
dimension, the better the decay of the Fourier transform $\hat{\mathcal{R}}(f)(\lambda, \gamma)$,
and so the better the regularity of the Radon transform $\mathcal{R}(f)(t, \gamma)$ as a
function of $t$.

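Proof. The Plancherel formula in Chapter 5 guarantees that

$$2 \int_{\mathbb{R}^d} |f(x)|^2\, dx = 2 \int_{\mathbb{R}^d} |\hat{f}(\xi)|^2\, d\xi.$$

Changing to polar coordinates $\xi = \lambda\gamma$ where $\lambda > 0$ and $\gamma \in S^{d-1}$, we
obtain

$$2 \int_{\mathbb{R}^d} |\hat{f}(\xi)|^2\, d\xi = 2 \int_{S^{d-1}} \int_0^{\infty} |\hat{f}(\lambda\gamma)|^2 \lambda^{d-1}\, d\lambda\, d\sigma(\gamma).$$

We now observe that a simple change of variables provides

$$\int_{S^{d-1}} \int_0^{\infty} |\hat{f}(\lambda\gamma)|^2 \lambda^{d-1}\, d\lambda\, d\sigma(\gamma) = \int_{S^{d-1}} \int_{-\infty}^{0} |\hat{f}(\lambda\gamma)|^2 |\lambda|^{d-1}\, d\lambda\, d\sigma(\gamma),$$

and the proof is complete once we invoke the result of Lemma 4.7.

The final ingredient in the proof of Theorem 4.5 consists of the following: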

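Lemma 4.9 Suppose

$$F(t) = \int_{-\infty}^{\infty} \hat{F}(\lambda)\, e^{2\pi i \lambda t}\, d\lambda,$$

where

$$\sup_{\lambda \in \mathbb{R}} |\hat{F}(\lambda)| \le A \quad \text{and} \quad \int_{-\infty}^{\infty} |\hat{F}(\lambda)|^2\, |\lambda|^{d-1}\, d\lambda \le B^2.$$

Then

$$\sup_{t \in \mathbb{R}} |F(t)| \le c(A + B). \tag{4}$$

Moreover, if $0 < \alpha < 1/2$, then

$$|F(t_1) - F(t_2)| \le c_\alpha |t_1 - t_2|^{\alpha} (A + B) \quad \text{for all } t_1, t_2. \tag{5}$$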

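Proof. The first inequality is obtained by considering separately the
two cases $|\lambda| \le 1$ and $|\lambda| > 1$. We write

$$F(t) = \int_{|\lambda| \le 1} \hat{F}(\lambda)\, e^{2\pi i \lambda t}\, d\lambda + \int_{|\lambda| > 1} \hat{F}(\lambda)\, e^{2\pi i \lambda t}\, d\lambda.$$

Clearly, the first integral is bounded by $cA$. To estimate the second
integral it suffices to bound $\int_{|\lambda| > 1} |\hat{F}(\lambda)|\, d\lambda$. An application of the
Cauchy-Schwarz inequality gives

$$\int_{|\lambda| > 1} |\hat{F}(\lambda)|\, d\lambda \le \left( \int_{|\lambda| > 1} |\hat{F}(\lambda)|^2\, |\lambda|^{d-1}\, d\lambda \right)^{1/2} \left( \int_{|\lambda| > 1} |\lambda|^{-d+1}\, d\lambda \right)^{1/2}.$$

This last integral is convergent precisely when $-d + 1 < -1$, which is
equivalent to $d > 2$, namely $d \ge 3$, which we assume. Hence
$|F(t)| \le c(A + B)$ as desired.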
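
For concreteness, the second factor in the Cauchy-Schwarz estimate can be computed explicitly. The short calculation below is not part of the original argument; it only makes visible where the restriction on $d$ enters.

```latex
\[
\int_{|\lambda|>1} |\lambda|^{-d+1}\, d\lambda
  = 2\int_1^{\infty} \lambda^{-(d-1)}\, d\lambda
  = \frac{2}{d-2} \quad (d \ge 3),
\qquad\text{so}\qquad
\int_{|\lambda|>1} |\hat{F}(\lambda)|\, d\lambda \le \Big(\frac{2}{d-2}\Big)^{1/2} B .
\]
```

When $d = 2$ the integrand is $\lambda^{-1}$ and the integral diverges logarithmically, which is consistent with the need for a modified estimate in the plane.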

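To establish Lipschitz continuity, we first note that

$$F(t_1) - F(t_2) = \int_{-\infty}^{\infty} \hat{F}(\lambda) \left[ e^{2\pi i \lambda t_1} - e^{2\pi i \lambda t_2} \right] d\lambda.$$

Since one has the inequality^ $|e^{ix} - 1| \le |x|$, we immediately see that

$$|e^{2\pi i \lambda t_1} - e^{2\pi i \lambda t_2}| \le c\, |t_1 - t_2|^{\alpha} |\lambda|^{\alpha} \quad \text{if } 0 \le \alpha < 1.$$

We may then write the difference $F(t_1) - F(t_2)$ as a sum of two integrals.
The integral over $|\lambda| \le 1$ is clearly bounded by $cA|t_1 - t_2|^{\alpha}$. The
second integral, the one over $|\lambda| > 1$, can be estimated from above by

$$c\, |t_1 - t_2|^{\alpha} \int_{|\lambda| > 1} |\hat{F}(\lambda)|\, |\lambda|^{\alpha}\, d\lambda.$$

An application of the Cauchy-Schwarz inequality shows that this last
integral is majorized by

$$\left( \int_{|\lambda| > 1} |\hat{F}(\lambda)|^2\, |\lambda|^{d-1}\, d\lambda \right)^{1/2} \left( \int_{|\lambda| > 1} |\lambda|^{-d+1+2\alpha}\, d\lambda \right)^{1/2} \le c_\alpha B,$$

since the second integral is finite if $-d + 1 + 2\alpha < -1$, and in particular
this holds if $\alpha < 1/2$ when $d \ge 3$. This concludes the proof of the lemma.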


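We now gather these results to prove the theorem. For each $\gamma \in S^{d-1}$
let

$$F(t) = \mathcal{R}(f)(t,\gamma).$$

Note that with this definition we have

$$\sup_{t \in \mathbb{R}} |F(t)| = \mathcal{R}^*(f)(\gamma).$$

Let

$$A(\gamma) = \sup_{\lambda} |\hat{F}(\lambda)| \quad \text{and} \quad B^2(\gamma) = \int_{-\infty}^{\infty} |\hat{F}(\lambda)|^2\, |\lambda|^{d-1}\, d\lambda.$$

Then by (4)

$$\sup_{t \in \mathbb{R}} |F(t)| \le c(A(\gamma) + B(\gamma)).$$

However, we observed that $\hat{F}(\lambda) = \hat{f}(\lambda\gamma)$, and hence

$$A(\gamma) \le \|f\|_{L^1(\mathbb{R}^d)}.$$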


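^The distance in the plane from the point $e^{ix}$ to the point 1 is shorter than the length
of the arc on the unit circle joining them.

Therefore

$$|\mathcal{R}^*(f)(\gamma)|^2 \le c(A(\gamma)^2 + B(\gamma)^2),$$

and thus

$$\int_{S^{d-1}} |\mathcal{R}^*(f)(\gamma)|^2\, d\sigma(\gamma) \le c\left( \|f\|^2_{L^1(\mathbb{R}^d)} + \|f\|^2_{L^2(\mathbb{R}^d)} \right),$$

since $\int_{S^{d-1}} B^2(\gamma)\, d\sigma(\gamma) = 2\|f\|^2_{L^2}$ by Lemma 4.8. Consequently,

$$\int_{S^{d-1}} \mathcal{R}^*(f)(\gamma)\, d\sigma(\gamma) \le c\left( \|f\|_{L^1(\mathbb{R}^d)} + \|f\|_{L^2(\mathbb{R}^d)} \right).$$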


Note that the identity we have used,

$$F(t) = \int_{-\infty}^{\infty} \hat{F}(\lambda)\, e^{2\pi i \lambda t}\, d\lambda,$$

with $F(t) = \mathcal{R}(f)(t,\gamma)$, is justified for almost every $\gamma \in S^{d-1}$ by the
Fourier inversion result in Theorem 4.2 of Chapter 2. Indeed, we have
seen that $A(\gamma)$ and $B(\gamma)$ are finite for almost every $\gamma$, and thus $\hat{F}$ is
integrable for those $\gamma$. This completes the proof of the theorem. The
corollary follows the same way if we use (5) instead of (4).

We now return to the situation in the plane to see what information 
we may deduce from the above analysis. The inequality (2) as it stands 
does not hold when d = 2. However, a modification of it does hold, and 
this will be used in the proof of Theorem 4.4. 


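If $f \in L^1(\mathbb{R}^d)$ we define

$$\mathcal{R}_\delta(f)(t,\gamma) = \frac{1}{2\delta} \int_{t-\delta}^{t+\delta} \mathcal{R}(f)(s,\gamma)\, ds.$$

In this definition of $\mathcal{R}_\delta(f)(t,\gamma)$ we integrate the function $f$ in a small
"strip" of width $2\delta$ around the plane $\mathcal{P}_{t,\gamma}$. Thus $\mathcal{R}_\delta$ is an average of
Radon transforms.

We let

$$\mathcal{R}^*_\delta(f)(\gamma) = \sup_{t \in \mathbb{R}} |\mathcal{R}_\delta(f)(t,\gamma)|.$$

Theorem 4.10 If $f$ is continuous with compact support, then

$$\int_{S^1} \mathcal{R}^*_\delta(f)(\gamma)\, d\sigma(\gamma) \le c\, (\log 1/\delta)^{1/2} \left( \|f\|_{L^1(\mathbb{R}^2)} + \|f\|_{L^2(\mathbb{R}^2)} \right)$$

when $0 < \delta \le 1/2$.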


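The same argument as in the proof of Theorem 4.5 applies here, except
that we need a modified version of Lemma 4.9. More precisely, let us set

$$F_\delta(t) = \int_{-\infty}^{\infty} \hat{F}(\lambda) \left( \frac{e^{2\pi i (t+\delta)\lambda} - e^{2\pi i (t-\delta)\lambda}}{2\pi i \lambda (2\delta)} \right) d\lambda,$$

and suppose that

$$\sup_{\lambda} |\hat{F}(\lambda)| \le A \quad \text{and} \quad \int_{-\infty}^{\infty} |\hat{F}(\lambda)|^2\, |\lambda|\, d\lambda \le B^2.$$

Then we claim that

$$\sup_{t} |F_\delta(t)| \le c\, (\log 1/\delta)^{1/2} (A + B). \tag{6}$$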


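Indeed, we use the fact that $|(\sin x)/x| \le 1$ to see that, in the definition
of $F_\delta(t)$, the integral over $|\lambda| \le 1$ contributes the term $cA$. Also, the
integral over $|\lambda| > 1$ can be split and is bounded by the sum

$$\int_{1 < |\lambda| \le 1/\delta} |\hat{F}(\lambda)|\, d\lambda + \frac{c}{\delta} \int_{1/\delta \le |\lambda|} |\hat{F}(\lambda)|\, |\lambda|^{-1}\, d\lambda.$$

The first integral above can be estimated by the Cauchy-Schwarz
inequality, as follows:

$$\int_{1 < |\lambda| \le 1/\delta} |\hat{F}(\lambda)|\, d\lambda \le c \left( \int |\hat{F}(\lambda)|^2\, |\lambda|\, d\lambda \right)^{1/2} \left( \int_{1 < |\lambda| \le 1/\delta} |\lambda|^{-1}\, d\lambda \right)^{1/2} \le cB\, (\log 1/\delta)^{1/2}.$$

Finally, we also note that

$$\frac{c}{\delta} \int_{1/\delta \le |\lambda|} |\hat{F}(\lambda)|\, |\lambda|^{-1}\, d\lambda \le \frac{c}{\delta} \left( \int_{1/\delta \le |\lambda|} |\hat{F}(\lambda)|^2\, |\lambda|\, d\lambda \right)^{1/2} \left( \int_{1/\delta \le |\lambda|} |\lambda|^{-3}\, d\lambda \right)^{1/2} \le cB,$$

and this establishes (6), and hence the theorem.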
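
The elementary computations behind these bounds can be written out explicitly. The sketch below records the identity for the averaging kernel, which is where $|(\sin x)/x| \le 1$ enters, together with the two integrals in $\lambda$; constants are not tracked carefully.

```latex
\begin{align*}
\frac{e^{2\pi i (t+\delta)\lambda} - e^{2\pi i (t-\delta)\lambda}}{2\pi i \lambda (2\delta)}
  &= e^{2\pi i t \lambda}\, \frac{\sin(2\pi\delta\lambda)}{2\pi\delta\lambda},
  \qquad \text{of modulus at most } \min\Big(1, \frac{1}{2\pi\delta|\lambda|}\Big), \\
\int_{1 < |\lambda| \le 1/\delta} |\lambda|^{-1}\, d\lambda
  &= 2\log(1/\delta), \\
\frac{1}{\delta}\Big( \int_{1/\delta \le |\lambda|} |\lambda|^{-3}\, d\lambda \Big)^{1/2}
  &= \frac{1}{\delta}\, (\delta^{2})^{1/2} = 1 .
\end{align*}
```

The logarithm in (6) therefore comes entirely from the middle range $1 < |\lambda| \le 1/\delta$.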
4.2 Regularity of sets when d ≥ 3

We now extend to the general context the basic estimates for the Radon
transform, proved for continuous functions of compact support. This will
yield the regularity result formulated in Theorem 4.1.

Proposition 4.11 Suppose $d \ge 3$, and let $f$ belong to $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$.
Then for a.e. $\gamma \in S^{d-1}$ we can assert the following:

(a) $f$ is measurable and integrable on the plane $\mathcal{P}_{t,\gamma}$, for every $t \in \mathbb{R}$.

(b) The function $\mathcal{R}(f)(t,\gamma)$ is continuous in $t$ and satisfies a Lipschitz
condition with exponent $\alpha$ for each $\alpha < 1/2$. Moreover, the
inequality (2) of Theorem 4.5 and its variant with (3) hold for $f$.

We prove this in a series of steps.

Step 1. We consider $f = \chi_{\mathcal{O}}$, the characteristic function of a bounded
open set $\mathcal{O}$. Here the assertion (a) is evident since $\mathcal{O} \cap \mathcal{P}_{t,\gamma}$ is an open
and bounded set in $\mathcal{P}_{t,\gamma}$, and hence is measurable. Thus $\mathcal{R}(f)(t,\gamma)$ is
defined for all $(t,\gamma)$.

Next we can find a sequence $\{f_n\}$ of non-negative continuous functions
of compact support so that for every $x$, $f_n(x)$ increases to $f(x)$ as
$n \to \infty$. Thus $\mathcal{R}(f_n)(t,\gamma) \to \mathcal{R}(f)(t,\gamma)$ for every $(t,\gamma)$ by the monotone
convergence theorem, and also $\mathcal{R}^*(f_n)(\gamma) \to \mathcal{R}^*(f)(\gamma)$ for each $\gamma \in S^{d-1}$.
As a result we see that the inequality (2) is valid for $f = \chi_{\mathcal{O}}$, with $\mathcal{O}$
open and bounded.

Step 2. We now consider $f = \chi_E$, where $E$ is a set of measure zero,
and take first the case when the set $E$ is bounded. Then we can find a
decreasing sequence $\{\mathcal{O}_n\}$ of open and bounded sets, such that $E \subset \mathcal{O}_n$,
while $m(\mathcal{O}_n) \to 0$ as $n \to \infty$.

Let $\tilde{E} = \bigcap \mathcal{O}_n$. Since $\tilde{E} \cap \mathcal{P}_{t,\gamma}$ is measurable for every $(t,\gamma)$, the
functions $\mathcal{R}(\chi_{\tilde{E}})(t,\gamma)$ and $\mathcal{R}^*(\chi_{\tilde{E}})(\gamma)$ are well-defined. However,
$\mathcal{R}^*(\chi_{\tilde{E}})(\gamma) \le \mathcal{R}^*(\chi_{\mathcal{O}_n})(\gamma)$, while the $\mathcal{R}^*(\chi_{\mathcal{O}_n})$ decrease. Thus the
inequality (2) we have just proved for $f = \chi_{\mathcal{O}_n}$ shows that
$\mathcal{R}^*(\chi_{\tilde{E}})(\gamma) = 0$ for a.e. $\gamma$. The fact that $E \subset \tilde{E}$ then implies that for
a.e. $\gamma$, the set $E \cap \mathcal{P}_{t,\gamma}$ has $(d-1)$-dimensional measure zero for every
$t \in \mathbb{R}$. This conclusion immediately extends to the case when $E$ is not
necessarily bounded, by writing $E$ as a countable union of bounded sets of
measure zero. Therefore Corollary 4.2 is proved.

Step 3. Here we assume that $f$ is a bounded measurable function
supported on a bounded set. Then by familiar arguments we can find
a sequence $\{f_n\}$ of continuous functions that are uniformly bounded,
supported in a fixed compact set, and so that $f_n(x) \to f(x)$ a.e. By the
bounded convergence theorem, $\|f_n - f\|_{L^1}$ and $\|f_n - f\|_{L^2}$ both tend to
zero as $n \to \infty$, and upon selecting a subsequence if necessary, we can
suppose that $\|f_n - f\|_{L^1} + \|f_n - f\|_{L^2} \le 2^{-n}$. By what we have just
proved in Step 2 we have, for a.e. $\gamma \in S^{d-1}$, that $f_n(x) \to f(x)$ on $\mathcal{P}_{t,\gamma}$
a.e. with respect to the measure $m_{d-1}$, for each $t \in \mathbb{R}$. Thus again by the
bounded convergence theorem for those $(t,\gamma)$, we see that
$\mathcal{R}(f_n)(t,\gamma) \to \mathcal{R}(f)(t,\gamma)$, and this limit defines $\mathcal{R}(f)$. Now applying
Theorem 4.5 to $f_n - f_{n-1}$ gives

$$\sum_{n=1}^{\infty} \int_{S^{d-1}} \mathcal{R}^*(f_n - f_{n-1})(\gamma)\, d\sigma(\gamma) \le c \sum_{n=1}^{\infty} 2^{-n} < \infty.$$

This means that

$$\sum_{n} \sup_{t} |\mathcal{R}(f_n)(t,\gamma) - \mathcal{R}(f_{n-1})(t,\gamma)| < \infty,$$

for a.e. $\gamma \in S^{d-1}$, and hence for those $\gamma$ the sequence of functions
$\mathcal{R}(f_n)(t,\gamma)$ converges uniformly. As a consequence, for those $\gamma$ the function
$\mathcal{R}(f)(t,\gamma)$ is continuous in $t$, and the inequality (2) is valid for this $f$.
The inequality with (3) is deduced in the same way.

Finally, we deal with the general $f$ in $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ by approximating
it by a sequence of bounded functions each with bounded support. The details
of the argument are similar to the case treated above and are left to the
reader.

Observe that the special case $f = \chi_E$ of the proposition gives us
Theorem 4.1.

4.3 Besicovitch sets have dimension 2 

Here we prove Theorem 4.4, that any Besicovitch set necessarily has
Hausdorff dimension 2. We use Theorem 4.10, namely, the inequality

$$\int_{S^1} \mathcal{R}^*_\delta(f)(\gamma)\, d\sigma(\gamma) \le c\, (\log 1/\delta)^{1/2} \left( \|f\|_{L^1(\mathbb{R}^2)} + \|f\|_{L^2(\mathbb{R}^2)} \right).$$

This inequality was proved under the assumption that $f$ was continuous
and had compact support. In the present situation it goes over without
difficulty to the general case where $f \in L^1(\mathbb{R}^2) \cap L^2(\mathbb{R}^2)$, by an easy
limiting argument, since it is clear that $\mathcal{R}^*_\delta(f_n)(\gamma)$ converges to
$\mathcal{R}^*_\delta(f)(\gamma)$ for all $\gamma$ if $f_n \to f$ in the $L^1$-norm.

Now suppose $F$ is a Besicovitch set and $\alpha$ is fixed with $0 < \alpha < 2$.
Assume that $F \subset \bigcup_{i=1}^{\infty} B_i$ is a covering, where $B_i$ are balls with diameter
less than a given number. We must show that

$$\sum_i (\operatorname{diam} B_i)^{\alpha} \ge c_\alpha > 0.$$

We proceed in two steps, considering first a simple situation that will
make clear the idea of the proof.

Case 1. We suppose first that all the balls $B_i$ have the same diameter
$\delta$ (with $\delta \le 1/2$) and also that there are only a finite number, say $N$, of
balls in the covering. We must prove that $N\delta^{\alpha} \ge c_\alpha$.

Let $B_i^*$ denote the double of $B_i$ and $F^* = \bigcup_i B_i^*$. Then, we clearly
have

$$m(F^*) \le cN\delta^2.$$

Since $F$ is a Besicovitch set, for each $\gamma \in S^1$ there is a segment $s_\gamma$ of
unit length, perpendicular to $\gamma$, and which is contained in $F$. Also, by
construction, any translate by less than $\delta$ of a point in $s_\gamma$ must belong
to $F^*$. Hence

$$\mathcal{R}^*_\delta(\chi_{F^*})(\gamma) \ge 1 \quad \text{for every } \gamma.$$

If we take $f = \chi_{F^*}$ in the inequality (6), and note that the Cauchy-Schwarz
inequality implies

$$\|\chi_{F^*}\|_{L^1(\mathbb{R}^2)} \le c\, \|\chi_{F^*}\|_{L^2(\mathbb{R}^2)} \le c\, (m(F^*))^{1/2},$$

then we obtain

$$c \le N^{1/2}\, \delta\, (\log 1/\delta)^{1/2}.$$

This implies $N\delta^{\alpha} \ge c$ for $\alpha < 2$.
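
The last implication can be spelled out as follows. This is a routine expansion added for convenience, with $c$ denoting a positive constant.

```latex
\begin{align*}
c \le N^{1/2}\, \delta\, (\log 1/\delta)^{1/2}
  \;&\Longrightarrow\; N \ge \frac{c^2}{\delta^{2}\log(1/\delta)}, \\
N\delta^{\alpha} \;&\ge\; c^2\, \frac{\delta^{-(2-\alpha)}}{\log(1/\delta)} .
\end{align*}
```

Since $2 - \alpha > 0$, the function $\delta \mapsto \delta^{-(2-\alpha)}/\log(1/\delta)$ is continuous and positive on $(0, 1/2]$ and tends to infinity as $\delta \to 0$, so it is bounded below there by a positive constant depending only on $\alpha$.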

Case 2. We now treat the general case. Suppose $F \subset \bigcup_{i=1}^{\infty} B_i$, where
the balls $B_i$ each have diameter less than 1. For each integer $k$ let $N_k$ be
the number of balls in the collection $\{B_i\}$ for which

$$2^{-k-1} \le \operatorname{diam} B_i \le 2^{-k}.$$

We need to show that $\sum_{k=0}^{\infty} N_k 2^{-k\alpha} \ge c_\alpha$. In fact, we shall prove the
stronger result that there exists a positive integer $k'$ such that
$N_{k'} 2^{-k'\alpha} \ge c_\alpha$.

Let

$$F_k = F \cap \left( \bigcup_{2^{-k-1} \le \operatorname{diam} B_i \le 2^{-k}} B_i \right),$$

and let

$$F_k^* = \bigcup_{2^{-k-1} \le \operatorname{diam} B_i \le 2^{-k}} B_i^*,$$

where $B_i^*$ denotes the double of $B_i$. Then we note that

$$m(F_k^*) \le c N_k 2^{-2k} \quad \text{for all } k.$$

Since $F$ is a Besicovitch set, for each $\gamma \in S^1$ there is a segment $s_\gamma$ of
unit length entirely contained in $F$. We now make precise the fact that
for some $k$, a large proportion of $s_\gamma$ belongs to $F_k$.

We pick a sequence of real numbers $\{a_k\}$ such that $0 \le a_k \le 1$,
$\sum_{k=0}^{\infty} a_k = 1$, but $a_k$ does not tend to zero too quickly. For instance, we
may choose $a_k = c_\epsilon 2^{-\epsilon k}$ with $c_\epsilon = 1 - 2^{-\epsilon}$, and $\epsilon > 0$ but $\epsilon$ sufficiently
small.

Then, for some $k$ we must have

$$m_1(s_\gamma \cap F_k) \ge a_k.$$

Otherwise, since $F = \bigcup F_k$, we would have

$$m_1(s_\gamma \cap F) < \sum a_k = 1,$$

and this contradicts the fact that $m_1(s_\gamma \cap F) = 1$, since $s_\gamma$ is entirely
contained in $F$.

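Therefore, with this $k$, we must have

$$\mathcal{R}^*_{2^{-k}}(\chi_{F_k^*})(\gamma) \ge a_k,$$

because any point of distance less than $2^{-k}$ from $F_k$ must belong to $F_k^*$.
Since the choice of $k$ may depend on $\gamma$, we let

$$E_k = \{\gamma : \mathcal{R}^*_{2^{-k}}(\chi_{F_k^*})(\gamma) \ge a_k\}.$$

By our previous observations, we have

$$S^1 = \bigcup_{k=1}^{\infty} E_k,$$

and so for at least one $k$, which we denote by $k'$, we have

$$m(E_{k'}) \ge 2\pi a_{k'},$$

for otherwise $m(S^1) < 2\pi \sum a_k = 2\pi$. As a result

$$2\pi a_{k'}^2 = 2\pi a_{k'} a_{k'} \le \int_{E_{k'}} a_{k'}\, d\sigma(\gamma) \le \int_{S^1} \mathcal{R}^*_{2^{-k'}}(\chi_{F_{k'}^*})(\gamma)\, d\sigma(\gamma).$$

By the fundamental inequality (6) we get

$$a_{k'}^2 \le c\, (\log 2^{k'})^{1/2}\, \|\chi_{F_{k'}^*}\|_{L^2(\mathbb{R}^2)}.$$

Recalling that by our choice $a_k \approx 2^{-\epsilon k}$, and noting that
$\|\chi_{F_{k'}^*}\|_{L^2} \le c N_{k'}^{1/2} 2^{-k'}$, we obtain

$$2^{(1-2\epsilon)k'} \le c\, (\log 2^{k'})^{1/2} N_{k'}^{1/2}.$$

Finally, this last inequality guarantees that $N_{k'} 2^{-\alpha k'} \ge c_\alpha$ as long as
$4\epsilon < 2 - \alpha$.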
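
The role of the condition $4\epsilon < 2 - \alpha$ can be made explicit by squaring the last inequality. The following lines are a routine expansion, with $c$ a positive constant.

```latex
\begin{align*}
2^{(1-2\epsilon)k'} \le c\, (\log 2^{k'})^{1/2} N_{k'}^{1/2}
  \;&\Longrightarrow\; N_{k'} \ge \frac{2^{(2-4\epsilon)k'}}{c^2\, k' \log 2}, \\
N_{k'}\, 2^{-\alpha k'} \;&\ge\; \frac{2^{(2-\alpha-4\epsilon)k'}}{c^2\, k' \log 2} .
\end{align*}
```

If $4\epsilon < 2 - \alpha$, the exponent $\beta = 2 - \alpha - 4\epsilon$ is positive, and $2^{\beta k}/k$ is bounded below by a positive constant over all integers $k \ge 1$; this gives the constant $c_\alpha$, independent of $k'$.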

This concludes the proof of the theorem. 

4.4 Construction of a Besicovitch set 

There are a number of different constructions of Besicovitch sets. The one 
we have chosen to describe here involves the concept of self-replicating 
sets, an idea that permeates much of the discussion of this chapter. 

We consider the Cantor set of constant dissection $\mathcal{C}_{1/2}$, which for
simplicity we shall write as $\mathcal{C}$, and which is defined in Exercise 3, Chapter 1.
Note that $\mathcal{C} = \bigcap_{k=0}^{\infty} C_k$, where $C_0 = [0,1]$, and $C_k$ is the union of $2^k$
closed intervals of length $4^{-k}$ obtained by removing from $C_{k-1}$ the $2^{k-1}$
centrally situated open intervals of length $\frac{1}{2} \cdot 4^{-k+1}$. The set $\mathcal{C}$ can also
be represented as the set of points $x \in [0,1]$ of the form $x = \sum_{k=1}^{\infty} \epsilon_k/4^k$,
with $\epsilon_k$ either 0 or 3.
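
The two descriptions of the approximating sets $C_k$ can be compared on a computer. The sketch below is only illustrative: the function names `cantor_generation` and `left_endpoints_from_digits` are hypothetical helpers introduced here, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def cantor_generation(k):
    """Return the 2**k closed intervals of C_k as (left, right) pairs, in increasing order."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        refined = []
        for a, b in intervals:
            q = (b - a) / 4
            # Remove the central open interval of half the length: keep the outer quarters.
            refined.append((a, a + q))
            refined.append((b - q, b))
        intervals = refined
    return intervals


def left_endpoints_from_digits(k):
    """Return the sorted sums eps_1/4 + ... + eps_k/4**k with each eps_j equal to 0 or 3."""
    points = [Fraction(0)]
    for j in range(1, k + 1):
        points = [p + Fraction(e, 4**j) for p in points for e in (0, 3)]
    return sorted(points)
```

For instance, `cantor_generation(2)` consists of the intervals with left endpoints 0, 3/16, 3/4, and 15/16, each of length 1/16, and these left endpoints are exactly the two-digit sums with digits 0 or 3.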

We now place a copy of $\mathcal{C}$ on the $x$-axis of the plane $\mathbb{R}^2 = \{(x, y)\}$, and a
copy of $\frac{1}{2}\mathcal{C}$ on the line $y = 1$. That is, we put $E_0 = \{(x, y) : x \in \mathcal{C},\ y = 0\}$
and $E_1 = \{(x, y) : 2x \in \mathcal{C},\ y = 1\}$. The set $F$ that will play the central
role is defined as the union of all line segments that join a point of $E_0$
with a point of $E_1$. (See Figure 13.)

Figure 13. Several line segments joining $E_0$ with $E_1$

Theorem 4.12 The set $F$ is compact and of two-dimensional measure
zero. It contains a translate of any unit line segment whose slope is a
number $s$ that lies outside the interval $(-1, 2)$.

Once the theorem is proved, our job is done. Indeed, a finite union of
rotations of the set $F$ contains unit segments of any slope, and that set
is therefore a Besicovitch set.

The proof of the required properties of the set $F$ amounts to showing
the following paradoxical facts about the set $\mathcal{C} + \lambda\mathcal{C}$, for $\lambda > 0$. Here
$\mathcal{C} + \lambda\mathcal{C} = \{x_1 + \lambda x_2 : x_1 \in \mathcal{C},\ x_2 \in \mathcal{C}\}$:

• $\mathcal{C} + \lambda\mathcal{C}$ has one-dimensional measure zero, for a.e. $\lambda$.

• $\mathcal{C} + \frac{1}{2}\mathcal{C}$ is the interval $[0, 3/2]$.

Let us see how these two assertions imply the theorem. First, we note
that the set $F$ is closed (and hence compact), because both $E_0$ and $E_1$
are closed. Next observe that with $0 < y < 1$, the slice $F^y$ of the set
$F$ is exactly $(1 - y)\mathcal{C} + \frac{y}{2}\mathcal{C}$. This set is obtained from the set $\mathcal{C} + \lambda\mathcal{C}$,
where $\lambda = y/(2(1 - y))$, by scaling with the factor $1 - y$. Hence $F^y$ is of
measure zero whenever $\mathcal{C} + \lambda\mathcal{C}$ is also of measure zero. Moreover, under
the mapping $y \mapsto \lambda$, sets of measure zero in $(0, \infty)$ correspond to sets of
measure zero in $(0, 1)$. (For this see, for example, Exercise 8 in Chapter 1,
or Problem 1 in Chapter 6.) Therefore, the first assertion and Fubini's
theorem prove that the (two-dimensional) measure of $F$ is zero.

Finally the slope $s$ of the segment joining the point $(x_0, 0)$ with the
point $(x_1, 1)$ is $s = 1/(x_1 - x_0)$. Thus the quantity $s$ can be realized if
$x_1 \in \mathcal{C}/2$ and $x_0 \in \mathcal{C}$, that is, if $1/s \in \mathcal{C}/2 - \mathcal{C}$. However, by an obvious
symmetry $\mathcal{C} = 1 - \mathcal{C}$, and so the condition becomes $1/s \in \mathcal{C}/2 + \mathcal{C} - 1$,
which by the second assertion is $1/s \in [-1, 1/2]$. This last is equivalent
with $s \notin (-1, 2)$.

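Our task therefore remains the proof of the two assertions above. The
proof of the second is nearly trivial. In fact,

$$\frac{2}{3}\left( \mathcal{C} + \frac{1}{2}\mathcal{C} \right) = \frac{2}{3}\mathcal{C} + \frac{1}{3}\mathcal{C},$$

and this set consists of all $x$ of the form
$x = \sum_{k=1}^{\infty} \left( \frac{2\epsilon_k}{3} + \frac{\epsilon_k'}{3} \right) 4^{-k}$, where
$\epsilon_k$ and $\epsilon_k'$ are independently 0 or 3. Since then $\frac{2\epsilon_k}{3} + \frac{\epsilon_k'}{3}$ can take any
of the values 0, 1, 2, or 3, we have that $\frac{2}{3}(\mathcal{C} + \frac{1}{2}\mathcal{C}) = [0, 1]$, and hence
$\mathcal{C} + \frac{1}{2}\mathcal{C} = [0, 3/2]$.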
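
The digit argument can be checked on finite truncations. In the sketch below (the helper names `combined_digits` and `truncated_sums` are hypothetical, introduced only for this illustration), the combined digits are confirmed to be 0, 1, 2, 3, and the $n$-digit sums are seen to be all multiples of $4^{-n}$ in $[0, 1)$, which is the finite version of the statement that every base-4 expansion occurs.

```python
from fractions import Fraction
from itertools import product


def combined_digits():
    """Values of 2*eps/3 + eps'/3 for eps, eps' in {0, 3}, sorted increasingly."""
    return sorted({Fraction(2 * e + ep, 3) for e in (0, 3) for ep in (0, 3)})


def truncated_sums(n):
    """All sums d_1/4 + ... + d_n/4**n with each d_k ranging over combined_digits()."""
    digits = combined_digits()
    return {
        sum(d * Fraction(1, 4**k) for k, d in enumerate(ds, start=1))
        for ds in product(digits, repeat=n)
    }
```

A usage example: `truncated_sums(3)` has 64 elements, namely `j/64` for `j = 0, ..., 63`. Passing to the limit in $n$ is what the text does when it concludes that the set is the full interval.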


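The proof that $m(\mathcal{C} + \lambda\mathcal{C}) = 0$ for a.e. $\lambda$

We come to the main point: that $\mathcal{C} + \lambda\mathcal{C}$ has measure zero for almost all
$\lambda$. We show this by examining the self-replicating properties of the sets
$\mathcal{C}$ and $\mathcal{C} + \lambda\mathcal{C}$.

We know that $\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2$, where $\mathcal{C}_1$ and $\mathcal{C}_2$ are two similar copies
of $\mathcal{C}$, obtained with similarity ratio 1/4, and given by $\mathcal{C}_1 = \frac{1}{4}\mathcal{C}$ and
$\mathcal{C}_2 = \frac{1}{4}\mathcal{C} + \frac{3}{4}$. Thus $\mathcal{C}_1 \subset [0, 1/4]$ and $\mathcal{C}_2 \subset [3/4, 1]$. Iterating $\ell$ times this
decomposition of $\mathcal{C}$, that is, reaching the $\ell$-th "generation," we can write

$$\mathcal{C} = \bigcup_{1 \le j \le 2^{\ell}} \mathcal{C}_j^{\ell}, \tag{7}$$

with $\mathcal{C}_1^{\ell} = (1/4)^{\ell}\mathcal{C}$ and each $\mathcal{C}_j^{\ell}$ a translate of $\mathcal{C}_1^{\ell}$.

We consider in the same way the set

$$\mathcal{K}(\lambda) = \mathcal{C} + \lambda\mathcal{C},$$

and we shall sometimes omit the $\lambda$ and write $\mathcal{K}(\lambda) = \mathcal{K}$, when this causes
no confusion. By its definition we have

$$\mathcal{K} = \mathcal{K}_1 \cup \mathcal{K}_2 \cup \mathcal{K}_3 \cup \mathcal{K}_4,$$

where $\mathcal{K}_1 = \mathcal{C}_1 + \lambda\mathcal{C}_1$, $\mathcal{K}_2 = \mathcal{C}_1 + \lambda\mathcal{C}_2$, $\mathcal{K}_3 = \mathcal{C}_2 + \lambda\mathcal{C}_1$, and $\mathcal{K}_4 = \mathcal{C}_2 + \lambda\mathcal{C}_2$.
An iteration of this decomposition using (7) gives

$$\mathcal{K} = \bigcup_{1 \le i \le 4^{\ell}} \mathcal{K}_i^{\ell}, \tag{8}$$

where each $\mathcal{K}_i^{\ell}$ equals $\mathcal{C}_{j_1}^{\ell} + \lambda\mathcal{C}_{j_2}^{\ell}$ for a pair of indices $j_1, j_2$. In fact,
this relation among the indices sets up a bijection between the $i$ with
$1 \le i \le 4^{\ell}$, and the pair $j_1, j_2$ with $1 \le j_1 \le 2^{\ell}$ and $1 \le j_2 \le 2^{\ell}$. Note
that each $\mathcal{K}_i^{\ell}$ is a translate of $\mathcal{K}_1^{\ell}$, and each $\mathcal{K}_i^{\ell}$ is also obtained from $\mathcal{K}$ by
a similarity of ratio $4^{-\ell}$. Now note that $\mathcal{C} = \mathcal{C}/4 \cup (\mathcal{C}/4 + 3/4)$ implies
that

$$\mathcal{K}(\lambda) = \mathcal{C} + \lambda\mathcal{C} = \left( \mathcal{C} + \frac{\lambda}{4}\mathcal{C} \right) \cup \left( \mathcal{C} + \frac{\lambda}{4}\mathcal{C} + \frac{3\lambda}{4} \right)
= \mathcal{K}(\lambda/4) \cup \left( \mathcal{K}(\lambda/4) + \frac{3\lambda}{4} \right).$$

Thus $\mathcal{K}(\lambda)$ has measure zero if and only if $\mathcal{K}(\lambda/4)$ has measure zero.
Hence it suffices to prove that $\mathcal{K}(\lambda)$ has measure zero for a.e. $\lambda \in [1, 4]$.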
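
The rescaling identity for $\mathcal{K}(\lambda)$ can be tested exactly on finite approximations. The sketch below is an illustration under stated assumptions, not part of the proof: it replaces $\mathcal{C}$ by the finite set $P_n$ of left endpoints of the $n$-th generation (so that $P_{n+1} = \frac{1}{4}P_n \cup (\frac{1}{4}P_n + \frac{3}{4})$), and the helper names are hypothetical.

```python
from fractions import Fraction


def left_endpoints(n):
    """Left endpoints of the 2**n intervals of generation n: sums of eps_k/4**k, eps_k in {0, 3}."""
    points = {Fraction(0)}
    for k in range(1, n + 1):
        points = {p + Fraction(e, 4**k) for p in points for e in (0, 3)}
    return points


def sumset(xs, lam, ys):
    """Return the finite set xs + lam * ys."""
    return {x + lam * y for x in xs for y in ys}


def check_rescaling(n, lam):
    """Compare P_n + lam*P_{n+1} with K' union (K' + 3*lam/4), where K' = P_n + (lam/4)*P_n."""
    p_n, p_next = left_endpoints(n), left_endpoints(n + 1)
    lhs = sumset(p_n, lam, p_next)
    inner = sumset(p_n, lam / 4, p_n)
    rhs = inner | {z + 3 * lam / 4 for z in inner}
    return lhs == rhs
```

For example, `check_rescaling(3, Fraction(5, 2))` returns `True`. Passing `lam` as a `Fraction` keeps the set comparison exact; with floating-point values the equality test could fail because of rounding.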

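After these preliminaries let us observe that we immediately obtain
that $m(\mathcal{K}(\lambda)) = 0$ for some special $\lambda$'s, those for which the following
coincidence takes place: for some $\ell$ and a pair $i$ and $i'$ with $i \ne i'$,

$$\mathcal{K}_i^{\ell}(\lambda) = \mathcal{K}_{i'}^{\ell}(\lambda).$$

Indeed, if we have this coincidence, then (8) gives

$$m(\mathcal{K}(\lambda)) \le \sum_{i=1,\ i \ne i'}^{4^{\ell}} m(\mathcal{K}_i^{\ell}(\lambda)) = (4^{\ell} - 1)\, 4^{-\ell}\, m(\mathcal{K}(\lambda)),$$

and this implies $m(\mathcal{K}(\lambda)) = 0$.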
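
To see why the inequality forces the measure to vanish, and to have one concrete coincidence in mind, the following remarks may help. The value $\lambda = 1$ is an illustration chosen here; it is not singled out in the text.

```latex
\begin{align*}
m(\mathcal{K}(\lambda)) \le (1 - 4^{-\ell})\, m(\mathcal{K}(\lambda))
  \;&\Longrightarrow\; 4^{-\ell}\, m(\mathcal{K}(\lambda)) \le 0
  \;\Longrightarrow\; m(\mathcal{K}(\lambda)) = 0, \\
\lambda = 1: \qquad
\mathcal{K}_2 = \mathcal{C}_1 + \mathcal{C}_2
  \;&=\; \tfrac14\mathcal{C} + \tfrac14\mathcal{C} + \tfrac34
  \;=\; \mathcal{C}_2 + \mathcal{C}_1 = \mathcal{K}_3 .
\end{align*}
```

The first line uses that $\mathcal{K}(\lambda)$ is compact, so its measure is finite. In the second line, $\ell = 1$, $i = 2$, $i' = 3$ give a coincidence at $\lambda = 1$, hence $m(\mathcal{C} + \mathcal{C}) = 0$. This agrees with a direct count: $\mathcal{C} + \mathcal{C}$ is the union of three copies of itself scaled by $1/4$, so it is covered at step $n$ by $3^n$ intervals of length $2 \cdot 4^{-n}$.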

The key insight below is that, in a quantitative sense, the $\lambda$'s for which
this coincidence takes place are "dense" relative to the size of $\ell$. More
precisely, we have the following.

Proposition 4.13 Suppose $\lambda_0$ and $\ell$ are given, with $1 \le \lambda_0 \le 4$ and $\ell$
a positive integer. Then, there exist a $\bar{\lambda}$ and a pair $i, i'$ with $i \ne i'$ such
that

$$\mathcal{K}_i^{\ell}(\bar{\lambda}) = \mathcal{K}_{i'}^{\ell}(\bar{\lambda}) \quad \text{and} \quad |\bar{\lambda} - \lambda_0| \le c\, 4^{-\ell}. \tag{9}$$

Here $c$ is a constant independent of $\lambda_0$ and $\ell$.

This is proved on the basis of the following observation.

Lemma 4.14 For every $\lambda_0$ there is a pair $1 \le i_1, i_2 \le 4$, with $i_1 \ne i_2$
such that $\mathcal{K}_{i_1}(\lambda_0)$ and $\mathcal{K}_{i_2}(\lambda_0)$ intersect.

Proof. Indeed, if the $\mathcal{K}_i$ are disjoint for $1 \le i \le 4$ then for sufficiently
small $\delta$ the $\mathcal{K}_i^{\delta}$ are also disjoint. Here we have used the notation that $F^{\delta}$
denotes the set of points of distance less than $\delta$ from $F$. (See Lemma 3.1
in Chapter 1.) However, $\mathcal{K}^{\delta} = \bigcup_{i=1}^{4} \mathcal{K}_i^{\delta}$, and by similarity
$m(\mathcal{K}^{4\delta}) = 4\, m(\mathcal{K}_i^{\delta})$. Thus by the disjointness of the $\mathcal{K}_i^{\delta}$ we have
$m(\mathcal{K}^{\delta}) = m(\mathcal{K}^{4\delta})$, which is a contradiction, since $\mathcal{K}^{4\delta} - \mathcal{K}^{\delta}$ contains an
open ball (of radius $3\delta/2$). The lemma is therefore proved.

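Now apply the lemma for our given $\lambda_0$ and write $\mathcal{K}_{i_1} = \mathcal{C}_{\mu_1} + \lambda_0\mathcal{C}_{\nu_1}$,
$\mathcal{K}_{i_2} = \mathcal{C}_{\mu_2} + \lambda_0\mathcal{C}_{\nu_2}$, where the $\mu$'s and $\nu$'s are either 1 or 2. However, since
$i_1 \ne i_2$ we have $\mu_1 \ne \mu_2$ or $\nu_1 \ne \nu_2$ (or both). Assume for the moment
that $\nu_1 \ne \nu_2$.

The fact that $\mathcal{K}_{i_1}(\lambda_0)$ and $\mathcal{K}_{i_2}(\lambda_0)$ intersect means that there are pairs
of numbers $(a, b)$ and $(a', b')$, with $a \in \mathcal{C}_{\mu_1}$, $b \in \mathcal{C}_{\nu_1}$, $a' \in \mathcal{C}_{\mu_2}$, and
$b' \in \mathcal{C}_{\nu_2}$ such that

$$a + \lambda_0 b = a' + \lambda_0 b'. \tag{10}$$

Note that the fact that $\nu_1 \ne \nu_2$ means that $|b - b'| \ge 1/2$. Next, looking
at the $\ell$-th generation we find via (7) that there are indices
$1 \le j_1, j_2, j_1', j_2' \le 2^{\ell}$ so that $a \in \mathcal{C}_{j_1}^{\ell} \subset \mathcal{C}_{\mu_1}$, $b \in \mathcal{C}_{j_2}^{\ell} \subset \mathcal{C}_{\nu_1}$,
$a' \in \mathcal{C}_{j_1'}^{\ell} \subset \mathcal{C}_{\mu_2}$, and $b' \in \mathcal{C}_{j_2'}^{\ell} \subset \mathcal{C}_{\nu_2}$. We also observe that the above
sets are translates of each other, that is, $\mathcal{C}_{j_1}^{\ell} = \mathcal{C}_{j_1'}^{\ell} + \tau_1$ and
$\mathcal{C}_{j_2}^{\ell} = \mathcal{C}_{j_2'}^{\ell} + \tau_2$, with $|\tau_k| \le 1$. Hence if $i$ and $i'$ correspond to the pairs
$(j_1, j_2)$ and $(j_1', j_2')$, respectively, we have

$$\mathcal{K}_i^{\ell}(\lambda) = \mathcal{K}_{i'}^{\ell}(\lambda) + \tau(\lambda) \quad \text{with } \tau(\lambda) = \tau_1 + \lambda\tau_2. \tag{11}$$

Now let $(A, B)$ be the pair that corresponds to $(a', b')$ under the above
translations, namely

$$A = a' + \tau_1, \quad B = b' + \tau_2. \tag{12}$$

We claim there is a $\bar{\lambda}$ such that

$$A + \bar{\lambda} B = a' + \bar{\lambda} b'. \tag{13}$$

In fact, by (12) we have put $B$ in $\mathcal{C}_{j_2}^{\ell} \subset \mathcal{C}_{\nu_1}$, while $b'$ is in $\mathcal{C}_{j_2'}^{\ell} \subset \mathcal{C}_{\nu_2}$. Thus
$|B - b'| \ge 1/2$, since $\nu_1 \ne \nu_2$. We can therefore solve (13) by taking
$\bar{\lambda} = (A - a')/(b' - B)$. Now we compare this with (10), and get
$\lambda_0 = (a - a')/(b' - b)$. Moreover, $|A - a| \le 4^{-\ell}$ and $|B - b| \le 4^{-\ell}$, since $A$
and $a$ both lie in $\mathcal{C}_{j_1}^{\ell}$, and $B$ and $b$ lie in $\mathcal{C}_{j_2}^{\ell}$. This yields the inequality

$$|\bar{\lambda} - \lambda_0| \le c\, 4^{-\ell}. \tag{14}$$

Also, (12) and (13) clearly imply $\tau(\bar{\lambda}) = \tau_1 + \bar{\lambda}\tau_2 = 0$, and this together
with (11) proves the coincidence.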

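Therefore our proposition is proved under the restriction we made
earlier that $\nu_1 \ne \nu_2$. The situation where instead $\mu_1 \ne \mu_2$ is obtained
from the case $\nu_1 \ne \nu_2$ if we replace $\lambda_0$ by $\lambda_0^{-1}$. Note that
$\mathcal{K}_i^{\ell}(\lambda_0) = \mathcal{K}_{i'}^{\ell}(\lambda_0)$ if and only if $\mathcal{C}_{j_1}^{\ell} + \lambda_0\mathcal{C}_{j_2}^{\ell} = \mathcal{C}_{j_1'}^{\ell} + \lambda_0\mathcal{C}_{j_2'}^{\ell}$, and this is the
same as $\mathcal{C}_{j_2}^{\ell} + \lambda_0^{-1}\mathcal{C}_{j_1}^{\ell} = \mathcal{C}_{j_2'}^{\ell} + \lambda_0^{-1}\mathcal{C}_{j_1'}^{\ell}$. This allows us to reduce to the case
$\mu_1 \ne \mu_2$, since $\mathcal{C}_{j_1}^{\ell} \subset \mathcal{C}_{\mu_1}$ and $\mathcal{C}_{j_1'}^{\ell} \subset \mathcal{C}_{\mu_2}$. Here the fact that $1 \le \lambda_0 \le 4$ gives
$\lambda_0^{-1} \le 1$ and guarantees that the constant $c$ in (9) can be taken to be
independent of $\lambda_0$. The proposition is therefore established.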

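Note that as a consequence, the following holds near the points $\bar{\lambda}$ where
the coincidence (9) takes place: If $|\lambda - \bar{\lambda}| \le \epsilon 4^{-\ell}$, then

$$\mathcal{K}_i^{\ell}(\lambda) = \mathcal{K}_{i'}^{\ell}(\lambda) + \tau(\lambda) \quad \text{with } |\tau(\lambda)| \le \epsilon 4^{-\ell}. \tag{15}$$

In fact, this is (11) together with the observation that

$$|\tau(\lambda)| = |\tau(\lambda) - \tau(\bar{\lambda})| \le |\lambda - \bar{\lambda}|,$$

since $\tau(\lambda) = \tau_1 + \lambda\tau_2$ and $|\tau_2| \le 1$.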

The assertion (15) leads to the following more elaborate version of
itself:

There is a set $\Lambda$ of full measure such that whenever $\lambda \in \Lambda$
and $\epsilon > 0$ are given, there are $\ell$ and a pair $i, i'$ so that (15)
holds.^

Indeed, for fixed $\epsilon > 0$, let $\Lambda_\epsilon$ denote the set of $\lambda$ that satisfies (15) for
some $\ell$, $i$ and $i'$. For any interval $I$ of length not exceeding 1, we have

$$m(\Lambda_\epsilon \cap I) \ge \epsilon 4^{-\ell} \ge c^{-1} \epsilon\, m(I),$$

because of (9) and (15). Thus $\Lambda_\epsilon^c$ has no points of Lebesgue density,
hence $\Lambda_\epsilon^c$ has measure zero, and thus $\Lambda_\epsilon$ is a set of full measure. (See
Corollary 1.5 in Chapter 3.) Since $\Lambda = \bigcap_\epsilon \Lambda_\epsilon$, and $\Lambda_\epsilon$ decreases with $\epsilon$,
we see that $\Lambda$ also has full measure and our assertion is proved.

Finally, our theorem will be established once we show that $m(\mathcal{K}(\lambda)) = 0$
whenever $\lambda \in \Lambda$. To prove this, we assume contrariwise that $m(\mathcal{K}(\lambda)) > 0$.
Using again the point of density argument, there must be for any
$0 < \delta < 1$, a non-empty open interval $I$ with $m(\mathcal{K}(\lambda) \cap I) \ge \delta m(I)$. We
then fix $\delta$ with $3/4 < \delta < 1$ and proceed. With this fixed $\delta$, we select
$\epsilon$ used below as $\epsilon = m(I)(1 - \delta)$. Next, find $\ell$, $i$, and $i'$ for which (15)
holds. The existence of such indices is guaranteed by the hypothesis that
$\lambda \in \Lambda$.

^The terminology that $\Lambda$ has "full measure" means that its complement has measure
zero.

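We then consider the two similarities (of ratio $4^{-\ell}$) that map $\mathcal{K}(\lambda)$ to
$\mathcal{K}_i^{\ell}(\lambda)$ and $\mathcal{K}_{i'}^{\ell}(\lambda)$, respectively. These take the interval $I$ to corresponding
intervals $I_i$ and $I_{i'}$, respectively, with $m(I_i) = m(I_{i'}) = 4^{-\ell} m(I)$.
Moreover,

$$m(\mathcal{K}_i^{\ell} \cap I_i) \ge \delta m(I_i) \quad \text{and} \quad m(\mathcal{K}_{i'}^{\ell} \cap I_{i'}) \ge \delta m(I_{i'}).$$

Also, as in (15), $I_{i'} = I_i + \tau(\lambda)$, with $|\tau(\lambda)| \le \epsilon 4^{-\ell}$. This shows that

$$m(I_i \cap I_{i'}) \ge m(I_i) - |\tau(\lambda)| \ge 4^{-\ell} m(I) - \epsilon 4^{-\ell} \ge \delta m(I_i),$$

since $\epsilon 4^{-\ell} = (1 - \delta) m(I_i)$. Thus $m(I_i - I_i \cap I_{i'}) \le (1 - \delta) m(I_i)$, and

$$m(\mathcal{K}_i^{\ell} \cap I_i \cap I_{i'}) \ge m(\mathcal{K}_i^{\ell} \cap I_i) - m(I_i - I_i \cap I_{i'}) \ge (2\delta - 1)\, m(I_i) > \tfrac{1}{2} m(I_i) \ge \tfrac{1}{2} m(I_i \cap I_{i'}).$$

So $m(\mathcal{K}_i^{\ell} \cap I_i \cap I_{i'}) > \frac{1}{2} m(I_i \cap I_{i'})$ and the same holds for $i'$ in place of $i$.
Hence $m(\mathcal{K}_i^{\ell} \cap \mathcal{K}_{i'}^{\ell}) > 0$, and this contradicts the decomposition (8) and
the fact that $m(\mathcal{K}_i^{\ell}) = 4^{-\ell} m(\mathcal{K})$ for every $i$. Therefore we obtain that
$m(\mathcal{K}(\lambda)) = 0$ for every $\lambda \in \Lambda$, and the proof of Theorem 4.12 is now
complete.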

5 Exercises 

1. Show that the measure $m_\alpha$ is not $\sigma$-finite on $\mathbb{R}^d$ if $\alpha < d$.

2. Suppose $E_1$ and $E_2$ are two compact subsets of $\mathbb{R}^d$ such that $E_1 \cap E_2$ contains
at most one point. Show directly from the definition of the exterior measure that
if $0 < \alpha \le d$, and $E = E_1 \cup E_2$, then

$$m^*_\alpha(E) = m^*_\alpha(E_1) + m^*_\alpha(E_2).$$

[Hint: Suppose $E_1 \cap E_2 = \{x\}$, let $B_\epsilon$ denote the open ball centered at $x$ and of
diameter $\epsilon$, and let $E^\epsilon = E \cap B_\epsilon^c$. Show that

$$m^*_\alpha(E^\epsilon) \ge \mathcal{H}^\epsilon_\alpha(E) \ge m^*_\alpha(E) - \mu(\epsilon) - \epsilon^\alpha,$$

where $\mu(\epsilon) \to 0$. Hence $m^*_\alpha(E_1) + m^*_\alpha(E_2) \ge m^*_\alpha(E)$.]

3. Prove that if $f : [0, 1] \to \mathbb{R}$ satisfies a Lipschitz condition of exponent $\gamma > 1$,
then $f$ is a constant.

4. Suppose $f : [0, 1] \to [0, 1] \times [0, 1]$ is surjective and satisfies a Lipschitz condition

$$|f(x) - f(y)| \le C|x - y|^{\gamma}.$$

Prove that $\gamma \le 1/2$ directly, without using Theorem 2.2.

[Hint: Divide $[0, 1]$ into $N$ intervals of equal length. The image of each sub-interval
is contained in a ball of volume $O(N^{-2\gamma})$, and the union of all these balls must
cover the square.]

5. Let $f(x) = x^k$ be defined on $\mathbb{R}$, where $k$ is a positive integer and let $E$ be a
Borel subset of $\mathbb{R}$.

(a) Show that if $m_\alpha(E) = 0$ for some $\alpha$, then $m_\alpha(f(E)) = 0$.

(b) Prove that $\dim(E) = \dim f(E)$.

6. Let $\{E_k\}$ be a sequence of Borel sets in $\mathbb{R}^d$. Show that if $\dim E_k \le \alpha$ for some
$\alpha$ and all $k$, then

$$\dim \bigcup_k E_k \le \alpha.$$


7. Prove that the $(\log 2/\log 3)$-Hausdorff measure of the Cantor set is precisely
equal to 1.

[Hint: Suppose we have a covering of $\mathcal{C}$ by finitely many closed intervals $\{I_j\}$.
Then there exists another covering of $\mathcal{C}$ by intervals $\{I'_\ell\}$ each of length $3^{-k}$ for
some $k$, such that $\sum_j |I_j|^{\alpha} \ge \sum_\ell |I'_\ell|^{\alpha} \ge 1$, where $\alpha = \log 2/\log 3$.]

8. Show that the Cantor set of constant dissection, $\mathcal{C}_\xi$, in Exercise 3 of Chapter 1
has strict Hausdorff dimension $\log 2/\log(2/(1 - \xi))$.

9. Consider the set $\mathcal{C}_{\xi_1} \times \mathcal{C}_{\xi_2}$ in $\mathbb{R}^2$, with $\mathcal{C}_\xi$ as in the previous exercise. Show that
$\mathcal{C}_{\xi_1} \times \mathcal{C}_{\xi_2}$ has strict Hausdorff dimension $\dim(\mathcal{C}_{\xi_1}) + \dim(\mathcal{C}_{\xi_2})$.

10. Construct a Cantor-like set (as in Exercise 4, Chapter 1) that has Lebesgue
measure zero, yet Hausdorff dimension 1.

[Hint: Choose $\ell_1, \ell_2, \ldots, \ell_k, \ldots$ so that $1 - \sum_{j=1}^{k} 2^{j-1}\ell_j$ tends to zero sufficiently
slowly as $k \to \infty$.]

11. Let $\mathcal{D} = \mathcal{D}_\mu$ be the Cantor dust in $\mathbb{R}^2$ given as the product $\mathcal{C}_\xi \times \mathcal{C}_\xi$, with
$\mu = (1 - \xi)/2$.

(a) Show that for any real number $\lambda$, the set $\mathcal{C}_\xi + \lambda\mathcal{C}_\xi$ is similar to the projection
of $\mathcal{D}$ on the line in $\mathbb{R}^2$ with slope $\lambda = \tan\theta$.

(b) Note that among the Cantor sets $\mathcal{C}_\xi$, the value $\xi = 1/2$ is critical in the
construction of the Besicovitch set in Section 4.4. In fact, prove that with
$\xi > 1/2$, then $\mathcal{C}_\xi + \lambda\mathcal{C}_\xi$ has Lebesgue measure zero for every $\lambda$. See also
Problem 10 below.

[Hint: $m_\alpha(\mathcal{C}_\xi + \lambda\mathcal{C}_\xi) < \infty$ for $\alpha = \dim \mathcal{D}_\mu$.]

12. Define a primitive one-dimensional "measure" $\tilde{m}_1$ as

$$\tilde{m}_1(E) = \inf \sum_{k=1}^{\infty} \operatorname{diam} F_k, \quad E \subset \bigcup_{k=1}^{\infty} F_k.$$

This is akin to the one-dimensional exterior measure $m^*_\alpha$, $\alpha = 1$, except that no
restriction is placed on the size of the diameters of the $F_k$.

Suppose $I_1$ and $I_2$ are two disjoint unit segments in $\mathbb{R}^d$, $d \ge 2$, with $I_1 = I_2 + h$,
and $|h| < \epsilon$. Then observe that $\tilde{m}_1(I_1) = \tilde{m}_1(I_2) = 1$, while $\tilde{m}_1(I_1 \cup I_2) \le 1 + \epsilon$.
Thus

$$\tilde{m}_1(I_1 \cup I_2) < \tilde{m}_1(I_1) + \tilde{m}_1(I_2) \quad \text{when } \epsilon < 1;$$

hence $\tilde{m}_1$ fails to be additive.

13. Consider the von Koch curve $\mathcal{K}^{\ell}$, $1/4 < \ell < 1/2$, as defined in Section 2.1.
Prove for it the analogue of Theorem 2.7: the function $t \mapsto \mathcal{K}^{\ell}(t)$ satisfies a
Lipschitz condition of exponent $\gamma = \log(1/\ell)/\log 4$. Moreover, show that the set
$\mathcal{K}^{\ell}$ has strict Hausdorff dimension $\alpha = 1/\gamma$.

[Hint: Show that if $\mathcal{O}$ is the shaded open triangle indicated in Figure 14, then
$\mathcal{O} \supset S_0(\mathcal{O}) \cup S_1(\mathcal{O}) \cup S_2(\mathcal{O}) \cup S_3(\mathcal{O})$, where $S_0(x) = \ell x$, $S_1(x) = \rho_\theta(\ell x) + a$,
$S_2(x) = \rho_\theta^{-1}(\ell x) + c$, and $S_3(x) = \ell x + b$, with $\rho_\theta$ the rotation of angle $\theta$. Note that
the sets $S_j(\mathcal{O})$ are disjoint.]

Figure 14. The open set $\mathcal{O}$ in Exercise 13


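14. Show that if $\ell < 1/2$, the von Koch curve $t \mapsto \mathcal{K}^{\ell}(t)$ in Exercise 13 is a simple
curve.

[Hint: Observe that if $t = \sum_{j=1}^{\infty} a_j/4^j$, with $a_j = 0, 1, 2$, or $3$, then

$$\{\mathcal{K}^{\ell}(t)\} = \bigcap_{j=1}^{\infty} S_{a_1}\left( \cdots S_{a_j}(\overline{\mathcal{O}}) \right).]$$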


15. Note that if we take $\ell = 1/2$ in the definition of the von Koch curve in
Exercise 13 we get a "space-filling" curve, one that fills the right triangle whose
vertices are $(0,0)$, $(1,0)$, and $(1/2, 1/2)$. The first three steps of the construction
are as in Figure 15, with the intervals traced out in the indicated order.

Figure 15. The first three steps of the von Koch curve when $\ell = 1/2$


16. Prove that the von Koch curve $t \mapsto \mathcal{K}^{\ell}(t)$, $1/4 < \ell < 1/2$, is continuous but
nowhere differentiable.

[Hint: If $\mathcal{K}'(t)$ exists for some $t$, then

$$\lim_{n \to \infty} \frac{\mathcal{K}(u_n) - \mathcal{K}(v_n)}{u_n - v_n}$$

must exist, where $u_n \le t \le v_n$, and $u_n - v_n \to 0$. Choose $u_n = k/4^n$ and
$v_n = (k + 1)/4^n$.]

17. For a compact set $E$ in $\mathbb{R}^d$, define $\#(\epsilon)$ to be the least number of balls of
radius $\epsilon$ that cover $E$. Note that we always have $\#(\epsilon) = O(\epsilon^{-d})$ as $\epsilon \to 0$, and
$\#(\epsilon) = O(1)$ if $E$ is finite.

One defines the covering dimension of $E$, denoted by $\dim_C(E)$, as $\inf \beta$ such
that $\#(\epsilon) = O(\epsilon^{-\beta})$, as $\epsilon \to 0$. Show that $\dim_C(E) = \dim_M(E)$, where $\dim_M$ is the
Minkowski dimension discussed in Section 2.1, by proving the following inequalities
for all $\delta > 0$:

(i) $m(E^{\delta}) \le c\, \#(\delta)\, \delta^{d}$.

(ii) $\#(\delta) \le c'\, m(E^{\delta})\, \delta^{-d}$.

[Hint: To prove (ii), use Lemma 1.2 in Chapter 3 to find a collection of disjoint
balls $B_1, B_2, \ldots, B_N$ of radius $\delta/3$, each centered at $E$, such that their "triples"
$\tilde{B}_1, \tilde{B}_2, \ldots, \tilde{B}_N$ (of radius $\delta$) cover $E$. Then $\#(\delta) \le N$, while
$N m(B_j) = cN\delta^{d} \le m(E^{\delta})$, since the balls $B_j$ are disjoint and are contained in $E^{\delta}$.]


18. Let $E$ be a compact set in $\mathbb{R}^d$.

(a) Prove that $\dim(E) \le \dim_M(E)$, where $\dim$ and $\dim_M$ are the Hausdorff and
Minkowski dimensions, respectively.

(b) However, prove that if $E = \{0, 1/\log 2, 1/\log 3, \ldots, 1/\log n, \ldots\}$, then
$\dim_M E = 1$, yet $\dim E = 0$.

19. Show that there is a constant $c_d$, dependent only on the dimension $d$, such
that whenever $E$ is a compact set,

$$m(E^{2\delta}) \le c_d\, m(E^{\delta}).$$

[Hint: Consider the maximal function $f^*$, with $f = \chi_{E^{\delta}}$, and take $c_d = 6^d$.]

20. Show that if $F$ is the self-similar set considered in Theorem 2.12, then it has
the same Minkowski dimension as Hausdorff dimension.

[Hint: Each $F_k$ is the union of $m^k$ balls of radius $cr^k$. In the converse direction one
sees by Lemma 2.13 that if $\epsilon = r^k$, then each ball of radius $\epsilon$ can contain at most
$c'$ vertices of the $k$-th generation. So it takes at least $m^k/c'$ such balls to cover $F$.]

21. From the unit interval, remove the second and fourth quarters (open intervals).
Repeat this process in the remaining two closed intervals, and so on. Let $F$ be the
limiting set, so that

$$F = \left\{ x : x = \sum_{k=1}^{\infty} a_k/4^k,\ a_k = 0 \text{ or } 2 \right\}.$$

Prove that $0 < m_{1/2}(F) < \infty$.

22. Suppose $F$ is the self-similar set arising in Theorem 2.9.

(a) Show that if $m < 1/r^d$, then $m_d(F_i \cap F_j) = 0$ if $i \ne j$.

(b) However, if $m > 1/r^d$, prove that $F_i \cap F_j$ is not empty for some $i \ne j$.

(c) Prove that under the hypothesis of Theorem 2.12

$$m_\alpha(F_i \cap F_j) = 0, \quad \text{with } \alpha = \log m/\log(1/r), \text{ whenever } i \ne j.$$

23. Suppose $S_1, \ldots, S_m$ are similarities with ratio $r$, $0 < r < 1$. For each set $E$,
let

$$\tilde{S}(E) = S_1(E) \cup \cdots \cup S_m(E),$$

and suppose $F$ denotes the unique non-empty compact set with $\tilde{S}(F) = F$.

(a) If $x \in F$, show that the set of points $\{\tilde{S}^n(x)\}_{n=1}^{\infty}$ is dense in $F$.

(b) Show that $F$ is homogeneous in the following sense: if $x_0 \in F$ and $B$ is
any open ball centered at $x_0$, then $F \cap B$ contains a set similar to $F$.


24. Suppose $E$ is a Borel subset of $\mathbb{R}^d$ with $\dim E < 1$. Prove that $E$ is totally
disconnected, that is, any two distinct points in $E$ belong to different connected
components.

[Hint: Fix $x, y \in E$, and show that $f(t) = |t - x|$ is Lipschitz of order 1, and hence
$\dim f(E) < 1$. Conclude that $f(E)$ has a dense complement in $\mathbb{R}$. Pick $r$ in the
complement of $f(E)$ so that $0 < r < f(y)$, and use the fact that
$E = \{t \in E : |t - x| < r\} \cup \{t \in E : |t - x| > r\}$.]

25. Let $F(t)$ be an arbitrary non-negative measurable function on $\mathbb{R}$, and $\gamma \in S^{d-1}$.
Then there exists a measurable set $E$ in $\mathbb{R}^d$, such that $F(t) = m_{d-1}(E \cap \mathcal{P}_{t,\gamma})$.

26. Theorem 4.1 can be refined for $d \ge 4$ as follows.

Define $C^{k,\alpha}$ to be the class of functions $F(t)$ on $\mathbb{R}$ that are $C^k$ and for which
$F^{(k)}(t)$ satisfies a Lipschitz condition of exponent $\alpha$.

If $E$ has finite measure, then for a.e. $\gamma \in S^{d-1}$ the function $m(E \cap \mathcal{P}_{t,\gamma})$ is in
$C^{k,\alpha}$ for $k = (d - 3)/2$, $\alpha < 1/2$, if $d$ is odd, $d \ge 3$; and for $k = (d - 4)/2$, $\alpha < 1$,
if $d$ is even, $d \ge 4$.

27. Show that the modification of the inequality (2) of Theorem 4.5 fails if we
drop $\|f\|_{L^2(\mathbb{R}^d)}$ from the right-hand side.

[Hint: Consider $\mathcal{R}^*(f_\epsilon)$, with $f_\epsilon$ defined by $f_\epsilon(x) = (|x| + \epsilon)^{-d+\delta}$, for $|x| \le 1$, with
$\delta$ fixed, $0 < \delta < 1$, and $\epsilon \to 0$.]

28. Construct a compact set $E \subset \mathbb{R}^d$, $d \ge 3$, such that $m_d(E) = 0$, yet $E$ contains
translates of any segment of unit length in $\mathbb{R}^d$. (While particular examples of such
sets can be easily obtained from the case $d = 2$, the determination of the least
Hausdorff dimension among all such sets is an open problem.)


6 Problems 

1. Carry out the construction below of two sets $U$ and $V$ so that
$\dim U = \dim V = 0$ but $\dim(U \times V) \ge 1$.

Let $I_1, \dots, I_n, \dots$ be given as follows:

• Each $I_j$ is a finite sequence of consecutive positive integers; that is, for all $j$,

$I_j = \{n \in \mathbb{N} : A_j \le n \le B_j\}$ for some given $A_j$ and $B_j$.

• For each $j$, $I_{j+1}$ is to the right of $I_j$; that is, $A_{j+1} > B_j$.

Let $U \subset [0, 1]$ consist of all $x$ which when written dyadically $x = .a_1 a_2 \cdots a_n \cdots$
have the property that $a_n = 0$ whenever $n \in \bigcup_j I_j$. Assume also that $A_j$ and $B_j$
tend to infinity (as $j \to \infty$) rapidly enough, say $B_j/A_j \to \infty$ and $A_{j+1}/B_j \to \infty$.
Also, let $J_j$ be the complementary blocks of integers, that is,

$$J_j = \{n \in \mathbb{N} : B_j < n < A_{j+1}\}.$$

Let $V \subset [0, 1]$ consist of those $x = .a_1 a_2 \cdots a_n \cdots$ with $a_n = 0$ if $n \in \bigcup_j J_j$.
Prove that $U$ and $V$ have the desired property.
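
One observation that may help with the lower bound is that every $x \in [0,1]$ splits as $x = u + v$ with $u \in U$ and $v \in V$: the binary digits of $x$ whose index lies in some $I_j$ are given to $v$, and all remaining digits are given to $u$. The sketch below checks this on a finite binary expansion using exact rational arithmetic. The block endpoints in `blocks` are small illustrative values of our own choosing (they are not meant to satisfy the growth conditions on $A_j$ and $B_j$), and `split_digits` is a hypothetical helper name.

```python
from fractions import Fraction


def in_blocks(n, blocks):
    """True if the digit index n lies in one of the blocks I_j = [A_j, B_j]."""
    return any(a <= n <= b for a, b in blocks)


def split_digits(digits, blocks):
    """Split binary digits a_1, a_2, ... into digits for u and for v.

    digits[k] holds a_{k+1}. Digits with index in some I_j go to v,
    so u vanishes on the I_j; all other digits go to u, so v vanishes
    on the complementary blocks J_j.
    """
    u, v = [], []
    for k, a in enumerate(digits):
        if in_blocks(k + 1, blocks):
            u.append(0)
            v.append(a)
        else:
            u.append(a)
            v.append(0)
    return u, v


def value(digits):
    """Exact value of the finite binary expansion .a_1 a_2 ... a_N."""
    return sum(Fraction(a, 2 ** (k + 1)) for k, a in enumerate(digits))


# Illustrative blocks I_1 = {2, 3} and I_2 = {6, ..., 9} (assumed values).
blocks = [(2, 3), (6, 9)]
digits = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1]

u, v = split_digits(digits, blocks)
print(value(u), "+", value(v), "=", value(digits))
```

Because the digits of $u$ and $v$ occupy disjoint positions, no carrying occurs in the sum. Since $(u, v) \mapsto u + v$ is Lipschitz, this decomposition bears on $\dim(U \times V)$; the statement $\dim U = \dim V = 0$ still has to come from the growth conditions.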


2.* The iso-diametric inequality states the following: If $E$ is a bounded subset of
$\mathbb{R}^d$ and $\operatorname{diam} E = \sup\{|x - y| : x, y \in E\}$, then

$$m(E) \le v_d \left( \frac{\operatorname{diam} E}{2} \right)^d,$$

where $v_d$ denotes the volume of the unit ball in $\mathbb{R}^d$. In other words, among sets of
a given diameter, the ball has maximum volume. Clearly, it suffices to prove the
inequality for $\overline{E}$ instead of $E$, so we can assume that $E$ is compact.

(a) Prove the inequality in the special case when $E$ is symmetric, that is, $-x \in E$
whenever $x \in E$.

In general, one reduces to the symmetric case by using a technique called Steiner
symmetrization. If $e$ is a unit vector in $\mathbb{R}^d$, and $\mathcal{P}$ is a plane perpendicular to $e$,
the Steiner symmetrization of $E$ with respect to $e$ is defined by

$$S(E, e) = \{x + te : x \in \mathcal{P},\ |t| \le \tfrac{1}{2} L(E; e; x)\},$$

where $L(E; e; x) = m(\{t \in \mathbb{R} : x + te \in E\})$, and $m$ denotes the Lebesgue
measure. Note that $x + te \in S(E, e)$ if and only if $x - te \in S(E, e)$.

(b) Prove that $S(E, e)$ is a bounded measurable subset of $\mathbb{R}^d$ that satisfies
$m(S(E, e)) = m(E)$.

[Hint: Use Fubini's theorem.]

(c) Show that $\operatorname{diam} S(E, e) \le \operatorname{diam} E$.

(d) If $\rho$ is a rotation that leaves $E$ and $\mathcal{P}$ invariant, show that $\rho S(E, e) =
S(E, e)$.

(e) Finally, consider the standard basis $\{e_1, \dots, e_d\}$ of $\mathbb{R}^d$. Let $E_0 = E$, $E_1 =
S(E_0, e_1)$, $E_2 = S(E_1, e_2)$, and so on. Use the fact that $E_d$ is symmetric to
prove the iso-diametric inequality.

(f) Use the iso-diametric inequality to show that $m(E) = \frac{v_d}{2^d} m_d(E)$ for any
Borel set $E$ in $\mathbb{R}^d$.
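
A discrete analogue may make parts (b) and (c) more concrete. The sketch below symmetrizes a finite set of lattice points in the plane in the direction $e = (0, 1)$, with the line $\{y = 0\}$ playing the role of the perpendicular plane: each vertical column is replaced by the same number of points, spaced one unit apart and centered at height 0. This is only an illustration, with counting measure standing in for Lebesgue measure; the function names and the sample set are our own assumptions and do not come from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def steiner_symmetrize(points):
    """Discrete Steiner symmetrization of a set of lattice points (x, y).

    Each column {x = const} containing n points is replaced by n points
    at heights -(n-1)/2, ..., (n-1)/2, which are symmetric about y = 0.
    """
    column_sizes = Counter(x for x, _ in points)
    result = set()
    for x, n in column_sizes.items():
        for k in range(n):
            result.add((x, Fraction(2 * k - (n - 1), 2)))
    return result


def diameter_squared(points):
    """Square of the largest Euclidean distance between two points of the set."""
    return max(
        ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
         for p, q in combinations(points, 2)),
        default=0,
    )


# A small sample set (an assumed example).
E = {(0, 0), (0, 1), (0, 4), (1, 2), (1, 3), (2, 5)}
S = steiner_symmetrize(E)

print("number of points:", len(E), "->", len(S))
print("diameter squared:", diameter_squared(E), "->", diameter_squared(S))
```

The number of points per column is preserved by construction, which mirrors the slice-by-slice use of Fubini's theorem suggested in the hint to part (b).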

3. Suppose $S$ is a similarity.

(a) Show that $S$ maps a line segment to a line segment.

(b) Show that if $L_1$ and $L_2$ are two segments that make an angle $\alpha$, then $S(L_1)$
and $S(L_2)$ make an angle $\alpha$ or $-\alpha$.

(c) Show that every similarity is a composition of a translation, a rotation 
(possibly improper), and a dilation. 

4.* The following gives a generalization of the construction of the Cantor-Lebesgue
function.

Let $F$ be the compact set in Theorem 2.9 defined in terms of $m$ similarities
$S_1, S_2, \dots, S_m$ with ratio $0 < r < 1$. There exists a unique Borel measure $\mu$
supported on $F$ such that $\mu(F) = 1$ and

$$\mu(E) = \frac{1}{m} \sum_{j=1}^{m} \mu(S_j^{-1}(E))$$

for any Borel set $E$.

In the case when $F$ is the Cantor set, the Cantor-Lebesgue function is $\mu([0, x])$.

5. Prove a theorem of Hausdorff: Any compact subset $K$ of $\mathbb{R}^d$ is a continuous
image of the Cantor set $\mathcal{C}$.

[Hint: Cover $K$ by $2^{n_1}$ (some $n_1$) open balls of radius 1, say $B_1, \dots, B_{2^{n_1}}$ (with
possible repetitions). Let $K_{j_1} = K \cap \overline{B_{j_1}}$ and cover each $K_{j_1}$ with $2^{n_2}$ balls of
radius $1/2$ to obtain compact sets $K_{j_1, j_2}$, and so on. Express $t \in \mathcal{C}$ as a ternary
expansion, and assign to $t$ a unique point in $K$ defined by the intersection
$K_{j_1} \cap K_{j_1, j_2} \cap \cdots$ for appropriate $j_1, j_2, \dots$. To prove continuity, observe that if two
points in the Cantor set are close, then their ternary expansions agree to high
order.]

6. A compact subset $K$ of $\mathbb{R}^d$ is uniformly locally connected if given $\varepsilon > 0$
there exists $\delta > 0$ so that whenever $x, y \in K$ and $|x - y| < \delta$, there is a continuous
curve $\gamma$ in $K$ joining $x$ to $y$, such that $\gamma \subset B_\varepsilon(x)$ and $\gamma \subset B_\varepsilon(y)$.

Using the previous problem, one can show that a compact subset $K$ of $\mathbb{R}^d$ is
the continuous image of the unit interval $[0, 1]$ if and only if $K$ is uniformly locally
connected.

7. Formulate and prove a generalization of Theorem 3.5 to the effect that once
appropriate sets of measure zero are removed, there is a measure-preserving
isomorphism of the unit interval in $\mathbb{R}$ and the unit cube in $\mathbb{R}^d$.

8.* There exists a simple continuous curve in the plane of positive two-dimensional
measure.

9. Let $E$ be a compact set in $\mathbb{R}^{d-1}$. Show that $\dim(E \times I) = \dim(E) + 1$, where
$I$ is the unit interval in $\mathbb{R}$.

10.* Let $\mathcal{C}_\xi$ be the Cantor set considered in Exercises 8 and 11. If $\xi < 1/2$, then
$\mathcal{C}_\xi + \lambda \mathcal{C}_\xi$ has positive Lebesgue measure for almost every $\lambda$.


Notes and References 


There are several excellent books that cover many of the subjects treated here. 
Among these texts are Riesz and Nagy [27], Wheeden and Zygmund [33],
Folland [13], and Bruckner et al. [4].

Introduction 

The citation is a translation of a passage in a letter from Hermite to Stieltjes [18].

Chapter 1

The citation is a translation from the French of a passage in [3].

We refer to Devlin [7] for more details about the axiom of choice, Hausdorff 
maximal principle, and well-ordering principle. 

See the expository paper of Gardner [14] for a survey of results regarding the 
Brunn-Minkowski inequality. 

Chapter 2 

The citation is a passage from the preface to the first edition of Lebesgue's book
on integration [20].

Devlin [7] contains a discussion of the continuum hypothesis. 

Chapter 3 

The citation is from Hardy and Littlewood’s paper [15]. 

Hardy and Littlewood proved Theorem 1.1 in the one-dimensional case by
using the idea of rearrangements. The present form is due to Wiener.

Our treatment of the isoperimetric inequality is based on Federer [11]. This 
work also contains significant generalizations and much additional material on 
geometric measure theory. 

A proof of the Besicovitch covering lemma in Problem 3* is in Mattila [22].

For an account of functions of bounded variation in $\mathbb{R}^d$, see Evans and
Gariepy [8].

An outline of the proof of Problem 7 (b)* can be found at the end of Chapter 5 
in Book I. 

The result in part (b) of Problem 8* is a theorem of S. Saks, and its proof as 
a consequence of part (a) can be found in Stein [31]. 

Chapter 4 

The citation is translated from the introduction of Plancherel’s article [25]. 

An account of the theory of almost periodic functions which is touched upon 
in Problem 2* can be found in Bohr [2]. 

The results in Problems 4* and 5* are in Zygmund [35], in Chapters V and VII,
respectively.

Consult Birkhoff and Rota [1] for more on Sturm-Liouville systems, Legendre 
polynomials, and Hermite functions. 

Chapter 5

See Courant [6] for an account of the Dirichlet principle and some of its
applications. The solution of the Dirichlet problem for general domains in $\mathbb{R}^2$ and the
related notion of logarithmic capacity of sets are treated in Ransford [26].
Folland [12] contains another solution to the Dirichlet problem (valid in $\mathbb{R}^d$, $d \ge 2$)
by methods which do not use the Dirichlet principle.

The result regarding the existence of the conformal mapping stated in
Problem 3* is in Chapter VII of Zygmund [35].

Chapter 6 

The citation is a translation from the German of a passage in C. Carathéodory [5].

Petersen [24] gives a systematic presentation of ergodic theory, including a
proof of the theorem in Problem 7*.

The facts about spherical harmonics needed in Problem 4* can be found in 
Chapter 4 in Stein and Weiss [32]. 

We refer to Hardy and Wright [16] for an introduction to continued fractions. 
Their connection to ergodic theory is discussed in Ryll-Nardzewski [28]. 

Chapter 7 

The citation is a translation from the German of a passage in Hausdorff’s arti- 
cle [17], while Mandelbrot’s citation is from his book [21]. 

Mandelbrot’s book also contains many interesting examples of fractals arising 
in a variety of different settings, including a discussion of Richardson’s work on 
the length of coastlines. (See in particular Chapter 5.) 

Falconer [10] gives a systematic treatment of fractals and Hausdorff dimension. 
We refer to Sagan [29] for further details on space-filling curves, including the
construction of a curve arising in Problem 8*.

The monograph of Falconer [10] also contains an alternate construction of the 
Besicovitch set, as well as the fact that such sets must necessarily have dimension 
two. The particular Besicovitch set described in the text appears in Kahane [19], 
but the fact that it has measure zero required further ideas which are contained, 
for instance, in Peres et al. [30]. 

Regularity of sets in $\mathbb{R}^d$, $d \ge 3$, and the estimates for the maximal function
associated to the Radon transform are in Falconer [9], and Oberlin and Stein [23].

The theory of Besicovitch sets in higher dimensions, as well as a number of
interesting related topics can be found in the survey of Wolff [34].

Symbol Glossary


The page numbers on the right indicate the first time the symbol or
notation is defined or used. As usual, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ denote the integers,
the rationals, the reals, and the complex numbers respectively.


$|x|$                                      (Euclidean) Norm of $x$                                  2
$E^c$, $E - F$                             Complements and relative complements of sets             2
$d(E, F)$                                  Distance between two sets                                2
$B_r(x)$, $\overline{B_r}(x)$              Open and closed balls                                    2
$\overline{E}$, $\partial E$               Closure and boundary of $E$, respectively                3
$|R|$                                      Volume of the rectangle $R$                              3
$O(\cdots)$                                $O$ notation                                             12
$\mathcal{C}$, $\mathcal{C}_\xi$, $\hat{\mathcal{C}}$    Cantor sets                                9, 38
$m_*(E)$                                   Exterior (Lebesgue) measure of the set $E$               10
$E_k \nearrow E$, $E_k \searrow E$         Increasing and decreasing sequences of sets              20
$E \triangle F$                            Symmetric difference of $E$ and $F$                      21
$E_h = E + h$                              Translation by $h$ of the set $E$                        22
$\mathcal{B}_{\mathbb{R}^d}$               Borel $\sigma$-algebra on $\mathbb{R}^d$                 23
$G_\delta$, $F_\sigma$                     Sets of type $G_\delta$ or $F_\sigma$                    23
$\mathcal{N}$                              Non-measurable set                                       24
a.e.                                       Almost everywhere                                        30
$f^+(x)$, $f^-(x)$                         Positive and negative parts of $f$                       31, 64
$A + B$                                    Sum of two sets                                          35
$v_d$                                      Volume of the unit ball in $\mathbb{R}^d$                39
$\operatorname{supp}(f)$                   Support of the function $f$                              53
$f_k \nearrow f$, $f_k \searrow f$         Increasing and decreasing sequences of functions         62
$f_h$                                      Translation by $h$ of the function $f$                   73
$L^1(\mathbb{R}^d)$, $L^1_{\mathrm{loc}}(\mathbb{R}^d)$    Integrable and locally integrable functions    69, 105
$f * g$                                    Convolution of $f$ and $g$                               74
$f^y$, $f_x$, $E^y$, $E_x$                 Slices of the function $f$ and set $E$                   75
$\hat{f}$, $\mathcal{F}(f)$                Fourier transform of $f$                                 87, 208
$f^*$                                      Maximal functions of $f$                                 100, 296
$L(\gamma)$                                Length of the (rectifiable) curve $\gamma$               115
$T_F$, $P_F$, $N_F$                        Total, positive, and negative variations of $F$          117, 118
$L(A, B)$                                  Length of a curve between $t = A$ and $t = B$            120
$D^+(F), \dots, D_-(F)$                    Dini numbers of $F$                                      123
$\mathcal{M}(K)$                           Minkowski content of $K$                                 138
$\Omega_+(\delta)$, $\Omega_-(\delta)$     Outer and inner set of $\Omega$                          143
$L^2(\mathbb{R}^d)$                        Square integrable functions                              156
$\ell^2(\mathbb{Z})$, $\ell^2(\mathbb{N})$ Square summable sequences                                163
$\mathcal{H}$                              Hilbert space                                            161
$f \perp g$                                Orthogonal elements                                      164
$\mathbb{D}$                               Unit disc                                                173
$H^2(\mathbb{D})$, $H^2(\mathbb{R}^2_+)$   Hardy spaces                                             174, 213
$\mathcal{S}^\perp$                        Orthogonal complement of $\mathcal{S}$                   177
$A \oplus B$                               Direct sum of $A$ and $B$                                177
$P_{\mathcal{S}}$                          Orthogonal projection onto $\mathcal{S}$                 178
$T^*$, $L^*$                               Adjoint of operators                                     183, 222
$\mathcal{S}(\mathbb{R}^d)$                Schwartz space                                           208
$C_0^\infty(\Omega)$                       Smooth functions with compact support in $\Omega$        222
$C^n(\Omega)$, $C^n(\overline{\Omega})$    Functions with $n$ continuous derivatives on $\Omega$ and $\overline{\Omega}$    223
$\triangle u$                              Laplacian of $u$                                         230
$(X, \mathcal{M}, \mu)$                    Measure space                                            263
$\mu$, $\mu_*$, $\mu_0$                    Measure, exterior measure, premeasure                    263, 264, 270
$\mu_1 \times \mu_2$                       Product measure                                          276
$S^{d-1}$                                  Unit sphere in $\mathbb{R}^d$                            279
$\sigma$, $d\sigma(\gamma)$                Surface measure on the sphere                            280
$dF$                                       Lebesgue-Stieltjes measure                               282
$|\nu|$, $\nu^+$, $\nu^-$                  Total, positive, and negative variations of $\nu$        286, 287
$\nu \perp \mu$                            Mutually singular measures                               288
$\nu \ll \mu$                              Absolutely continuous measures                           289
$\sigma(S)$                                Spectrum of $S$                                          311
$m_\alpha^*(E)$                            Exterior $\alpha$-dimensional Hausdorff measure          325
$\operatorname{diam} S$                    Diameter of $S$                                          325
$\dim E$                                   Hausdorff dimension of $E$                               329
$\mathcal{S}$                              Sierpinski triangle                                      334
$A \approx B$                              $A$ comparable to $B$                                    335
$\mathcal{K}$, $\mathcal{K}^\ell$          Von Koch curves                                          338, 340
$\operatorname{dist}(A, B)$                Hausdorff distance                                       345
$\mathcal{P}(t)$                           Peano mapping                                            349
$\mathcal{P}_{t,\gamma}$                   Hyperplane                                               360
$\mathcal{R}(f)$, $\mathcal{R}_\delta(f)$  Radon transform                                          363, 368
$\mathcal{R}^*(f)$, $\mathcal{R}^*_\delta(f)$    Maximal Radon transform                            363, 368
